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Abstract In this paper we formulate and study a 
variational problem in the space of functions of bounded 
Hessian BH(f2). This forms a higher order extension 
of the well known ROF functional (total variation 
minimisation) which is widely used for image recon- 
struction. Our functional involves convex functions of 
the total variation and the total variation of the first 
derivatives. We prove existence and uniqueness of min- 
imisers via the method of relaxation. We use the split 
Bregman method in order to solve numerically the 
corresponding discretised problem and we prove some 
convergence results. We apply our model to image de- 
noising (Gaussian and impulse noise) and deblurring 
as well as image inpainting. The numerical results 
show that our model avoids the creation of undesir- 
able artifacts and blocky-like structures in the recon- 
structed images which is a disadvantage of the ROF 
model. 
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Bregman • Denoising • Deblurring • Inpainting • 
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1 Introduction 

We consider the following general framework of a com- 
bined first and second order non-smooth regularisa- 
tion procedure. For a given datum u G L s (f2), Q C 
W 1 , s = 1,2, we compute a regularised reconstruction 
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u as a minimiser of a combined first and second order 
functional H(u). More precisely, we are interested in 
solving 



nin \h(u) = - 



(uq — Tu) s dx + a /(Vu) dx 



P f g{V 2 u) dx], 



(1.1) 



for s G {1,2}, non negative regularisation parameters 
a, j3 , with convex functions / : K 2 — > R + , g : R 4 — > 
M + with at most linear growth at infinity, and a suit- 
able linear operator T, see Section [3] for details. By 
combining nonlinear first and second order dynamics 
in the regulariser this approach is able to balance the 
preservation of jumps (taken care of by the first order 
term in the regulariser) against the smoothness within 
homogeneous image parts (modeled by the linearly in- 
creasing penalisation of the Hessian V 2 u). 

We prove the well-posedness of ( |1.1[ ) via the method 
of relaxation and discuss its relation to some higher 
order approaches in the literature. Its numerical solu- 
tion is presented for the case f(x) = \x\, g{x) = \x\ 
and f{x) = \x\\, g(x) = \x\\ using a splitting tech- 
nique and its application for image denoising, deblur- 
ring and inpainting is discussed. 



1.1 Context 

A general inverse problem in imaging reads as follows. 
Observing or measuring data Uq G % in a suitable 
Hilbert space of real functions defined on a domain 
fi, we seek for the original or reconstructed image u 
that fulfils the model 



Uq = TU, 



(1.2) 



where T is a linear operator in H, i.e., T G C(H). The 
operator T is the forward operator of this problem. 
Examples are blurring operators (in which case Tu 
denotes the convolution of u with a blurring kernel), 
T = T the Fourier transform or T — V n a projection 



operator onto a subspace of T~L (i.e., only a few samples 
of u are given). 

To reconstruct u one has to invert the operator T. 
This is not always possible since in many applications 
this inversion is ill-posed and further complicated by 
interferences like noise. In this case a common proce- 
dure in inverse problems is to add a-priori information 
to the model, which in general is given by a certain 
regularity assumption on the image u. Hence, instead 



of solving (1.2) one computes u as a minimiser of 



J(u) = D (u , Tu) + aip(u), 

defined in a suitable Banach space H^. Here, ip models 
the regularity assumption on u with a certain regular- 
ity parameter a > and is called the regulariser of 
the problem, and I? is a distance function in %. The 
latter depends on the statistics of the data uq, which 
can be either estimated or are known from the physics 
behind the acquisition of uq. For normally distributed 
ito, i.e., the interferences in it are Gaussian noise, this 
distance function is the squared L 2 norm of u — Tu. 
For the choice of the regulariser tp squared Hilbert 
space norms have a long tradition in inverse problems. 
The most prominent example is H 1 regularisation 



1, 

mm , 
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u 



Tu\ 2 



L 2 {Q) 



IVul 



(1.3) 



see also [53 , 46 . For T = Id this procedure is equiva- 
lent to Gaussian filtering of the given image uq, where 
the variance of the Gaussian kernel is related to the 
choice of a. The result of such a regularisation tech- 
nique is a linearly, i.e., isotropically, smoothed image 
u, for which the smoothing strength does not depend 
on Uq. Hence, while eliminating the disruptive noise 
in the given data uq also prominent structures like 
edges in the reconstructed image are blurred. This 
observation gave way to a new class of non-smooth 
norm regularisers, which aim to eliminate noise and 
smooth the image in homogeneous areas in the image, 
while preserving the relevant structures such as object 



boundaries and edges. More precisely, instead of (1.3) 



one considers the following functional over the space 



J(u) 



Uq 



Tu \\ 2 L 2 (f2) 



/(Vu) dx, 



(1.4) 



where / is a function from K to M. + with at most 
linear growth, see [IS]. As stated in (1.4) the min- 



imisation of J over W 1 ' 1 {Q) is not well-posed in gen- 
eral. For this reason relaxation procedures are applied, 
which embed the optimisation for J into the opti- 
misation for its lower semicontinuous envelope within 
the larger space of functions of bounded variation, see 
Section [2] The most famous example in image pro- 
cessing is f(s) — \s\, which for T = Id results in 
the so-called ROF model [2T] . In this case the relaxed 



model, where ||Vu||£i(j2) is replaced by the total vari- 
ation \Du\(f2) and J is minimised over the space of 
functions of bounded variation. Other examples for 
/ are regularised versions of the total variation like 
/(s) = V s 2 + e 2 for a positive e < 1 [I1I28] and alike 
pT l [32 l fT3 l l37] . The consideration of such regularised 
versions of |Vm| is sometimes necessary for the numer- 
ical solution of (1.4) e.g., by means of time-stepping 



gS] or multigrid-methods [49,3(11147]. 

As these and many more contributions in the im- 
age processing community have proven, this new non- 
smooth regularisation procedure indeed results into 
a nonlinear smoothing of the image, smoothing more 
in homogeneous areas of the image domain and pre- 
serving characteristic structures such as edges. In par- 
ticular, since the total variation regulariser is tuned 
towards the preservation of edges, this image model 
is performing very well if the reconstructed image is 
piecewise constant. The drawback of such a regulari- 
sation procedure becomes apparent as soon as images 
or signals (in ID) are considered which not only con- 
sist of flat regions and jumps, but also possess slanted 
regions, i.e., piecewise linear parts. The effect of total 
variation regularisation in this case is so-called stair- 
casing. Roughly this means that the total variation 
regularisation of a noisy linear function u in one di- 
mension is a staircase it, whose L 2 norm is close to Uq, 
see Figure [T] In one dimension this effect has been rig- 
orously studied in [2D]. In two dimensions this effect 
results in blocky-like images, see Figure [2] 

This undesirable feature of the total variation reg- 
ularisers gave reason to proposing image enhancement 
methods which involve higher order derivatives in the 
regulariser. Especially in recent years, higher order 
versions of non-smooth image enhancing methods have 
been considered. 

In this paper we introduce the general higher or- 



der model (1.1), which both subsumes many exist- 



ing higher order approaches (see sections 1.2||1.3 ) and 
through its generality gives rise to new models in this 
context. As such, its mathematical and numerical dis- 
cussion is novel and now presented in this paper. We 
prove existence and (under certain conditions) unique- 



ness of minimisers of ( 1.1 ) by the method of relaxation 



and present its numerical solution for image denoising, 
deblurring and inpainting by a splitting technique. In 



particular, we discuss the application of (1.1) to the 



formulation of (1.4) is the total variation denoising 



removal of additive and impulse noise as well as image 
deblurring and inpainting problems. 



1.2 Related work 

Already in the pioneering paper of Chambolle and Li- 
ons [13] the authors propose a higher order method 
by means of an inf-convolution of two convex regu- 
larisers. Here, a noisy image is decomposed into three 
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(a) Noisy, piecewise linear signal 



(b) First order total variation denoised signal 



Fig. 1 Illustration of the staircasing effect in one space dimension 




Fig. 2 First order image denoising and the staircasing effect: (1.) noisy image, (m.) denoised image, (r.) detail of the bottom 
right hand corner of the denoised image to visualise the staircasing effect (the creation of blocky-like patterns due to the 
first order regularise!') 



parts uq = U\ + u 2 + n by solving 



mm 



{ 2 I|U>! 



u 2 



2 



a / |Vui| dx 



|V 2 M2 



I dx\, 



(1.5) 



where S7 2 u 2 is the distributional Hessian of u 2 . Then, 
U\ is the piecewise constant part of uq, u 2 the piece- 
wise smooth part and n the noise (or texture). Along 
these lines a modified infimal-convolution approach 
has been recently proposed in the discrete setting in 
[44,45 . Another attempt to combine first and sec- 
ond regularisation originates from Chan, Marquina, 
and Mulet [15j . who consider total variation minimi- 
sation together with weighted versions of the Lapla- 
cian. More precisely, they consider a regularising term 
of the form 

|Vu| dx + [3 [ f(\\7u\)(Au) 2 dx, 

where / must be a function with certain growth con- 
ditions at infinity in order to allow jumps. The well- 
posedness of the latter in one space dimension has 
been rigorously analysed by Dal Maso, Fonseca, Leoni 
and Morini [2D] via the method of relaxation. 

The idea of bounded Hessian regularisers was also 
considered by Lysaker et al. [35,36 , Chan et al. [TrJ] , 
Scherzer et al. [4*2"ll3"3"] . Lai at al. [33], and Bergounioux 
and Piffet [5] . In these works the considered model has 
the general form 



mm < ~||u 



\l 2 (S7) 



a\V 2 u\{Q) 



Further, in |40j minimisers of functionals which are 
regularised by the total variation of the (I— l)st deriva- 
tive, i.e., 



\DV l - L u\{f2), 

are studied. Another interesting higher order total vari- 
ation model is proposed by Bredies et al. [5] . The con- 
sidered regulariser is called total generalised variation 
(TGV) and is of the form 

TGV£{u) = sup { / udiv fc £ dx : 



CeC^Sym^)), 

||div^||oo <a t , Z = 0,...,fc-l}, (1.6) 

where Sym fe (R d ) denotes the space of symmetric ten- 
sors of order k with arguments in R d , and ai are fixed 
positive parameters. Its formulation for the solution 
of general inverse problems was given in [5] . 

Properties of higher order regularisers in terms of 
diffusion filters are further studied in [23] . Therein, the 
authors consider the Euler-Lagrange equations corre- 
sponding to minimisers of functionals of the general 
type 



J(u) 



(U - uf 




dx, 



for different non-quadratic penaliser functions /. More- 
over, Bertozzi and Greer [SJ have rigorously studied 
the fourth-order evolution equation which arises as a 
gradient flow of f G(Au), where G is a nondecreasing 



3 



function of quadratic growth in a neighbourhood of 
and at most linear growth at infinity. Solutions of this 
model arc called low curvature image simplificrs and 
are given by 

u t = — aZ\(arctan(Au)) + (uq — u). 

when G(s) = sarctan(s) — l/21og(s 2 + 1). 

1.3 Relation of our model to TGV, infimal 
convolution regularisation and higher order diffusion 
filters 

In this section we want to analyse the connection of 



our combined 1st and 2nd order approach (1.1) with 



infimal convolution ( 1.5 ) [13,44,45 and the generalised 



total variation regulariscr (1.6 1 of order two [5] 



In the case of inf-convolution (1.51 the regularised 
image u = u\ + u 2 consists of a function u% £ BV(f2) 
and a function u 2 £ BH(Q) which are balanced against 
each other by positive parameters a, /3. Differently, 



a minimiser u of (1.1) is in BH(S1) as a whole and 
as such is more regular than the infimal convolution 
minimiser which is a function in BV{fl). Hence, in- 
fimal convolution reproduced edges in an image as 
perfect jumps while in our combined total variation 
- bounded Hessian approach edges are lines where the 
image function has a large but finite gradient every- 



where. Our approach (1.1) can be made equivalent 



to infimal convolution if combined with the correct 
choice of adaptive regularisation, e.g., [24,29]. More 
precisely, we replace the two constant parameters a 
and (3 by spatially varying functions a(x),f3(x) and 
minimise for u 
1 



dx 



a Vu dx 



/3|V 2 u| dx. 



Then, we can choose a and j3 according to (1.5), i.e., 
a = where u = u 2 , P = where u = u\, and a/ '0 
correctly balancing u\ and u 2 in the rest of SI. 



The relation of ( 1.1 ) to the regularisation approach 



with total generalised variation [5] of order 2 can be 
understood through its equivalence with the modified 
infimal convolution approach [45] in the discrete set- 
ting. The total generalised variation of order 2 is de- 
fined for a positive multi-index a = (oto, a±) as 

TGV 2 {u) = sup I j u div 2 v dx : 

t-eC c 2 (/],Sym 2 (R' i )), 
||div i w|| L oo (fi) < at, 1 = 0, lj, 

where Sym 2 (R d ) is the space of symmetric tensors of 
order 2 with arguments in R d . For smooth enough 
functions u the second order TGV can be written as 



TGViiu) 



inf 

ui+u 3 =tl 
u\ ulcere-'- , w% £kcr 

\-0 j |Vu 2 +w 2 \ dx}, 



V Mi +UJ1I dx 



for a positive multiindex (a,/3) and e l — |(V«( + 
Vu/-). As such TGV 2 is equivalent to the modified 
infimal convolution regulariser proposed in [45] - 

Finally, the relation to higher order diffusion fil- 
ters as analysed in [33] becomes apparent when con- 



sidering the Euler-Lagrange equation of (1.1) in the 



case T = Id and /, g have the form f(x) = h(\x\), 
g[x) — h(\x\), where h is convex and has at most lin- 
ear growth. Namely, with appropriate boundary con- 
ditions we obtain the following Euler-Lagrange equa- 
tion 



u 



adiv h'(\Vu\ 



Vu 



/3div 2 h'{\V 2 u\) 



V 2 u 
\W 2 u\ 



This simplifies for the case h(s) — \/|s| 2 + e 2 to a 
regularised total variation/bounded Hessian flow that 
reads 



Vu 



u 



v/|Vu| 2 + e 2 



-/3div 2 



V 2 ? 



VW 2 



The consideration of corresponding evolution equa- 



tions for (1.7) in the presence of different choices of 



penalising functions h promises to give rise to addi- 
tional properties of this regularisation technique and 
is a matter of future research. 

For more discussion and comparison of higher or- 
der regularisers we recommend Chapters 4.1.5-4.1.7 
and 6.4 in [3j. 



2 Preliminaries 

Remarks on standard notation: As usual, we de- 
note with C n and H n the Lebesgue measure and the 
n-dimensional Hausdorff measure on K™. Different no- 
tions are denoted by | • | : When it is applied on vectors 
or matrices it denotes the Euclidean norm (vector) or 
the Frobenious norm (matrices) . When it is applied on 
measures it denotes the total variation measure while 
when it is applied on Borcl subsets of K n it denotes 
the Lebesgue measure of that subset. Finally, | • |i de- 
notes the £1 norm in K™ and (■, •) denotes the standard 
Euclidean inner product. 



2.1 Finite Radon measures 

All our notation and definitions are consistent with [2] . 
From now on, SI will denote an open set in E™ . We de- 
fine the space [M(S7)] m to be the space of K m -valued 
finite Radon measures. The total variation measure of 
(1 £ [M(f})] m is denoted by |^|. We say that a se- 
quence (/ife)fegN in [M(fi)] m converges weakly* to a 



4 



measure u G [A^(J7)] m if lim^oo f„ u dfik = fn u dfi 
for all u G Co(J7) which is the completion of C c (f2) 
endowed with the supremum norm. We will often con- 
sider the Lebesgue decomposition of a E m -valued fi- 
nite Radon measure v with respect to a cr-finite posi- 
tive Borel measure a : 



v = v a + v s 



where v a is the absolutely continuous part of v with 
respect to fi, v s is the singular part and the iyj\i) 
denotes the density function of v with respect to \x 
(Radon-Nikodym derivative). Recall also that any u G 
[M.{Q)] m is absolutely continuous with respect to \a\ 
and thus we get the polar decomposition of (j, 



nth 



= 1, \n\ a.e. 



2.2 Convex functions of measures 

Let j be a continuous function from K m to K which 
is positively homogeneous of degree 1, i.e., for every 
x e E m 

g(tx) = tg(x), Vi>0. 

Given a measure // G [,M(/2)] m , we define the R- 
valued measure g(/z) as follows: 



I Ml 



It can be proved that if g is a convex function then 
<?(•) is a convex function in [M.(Q)] m and if v is any 
positive measure such that u is absolutely continuous 
with respect to v then 



gip) = g (^) v. 



We refer the reader to Proposition | A . 1| in Appendix |A| 
for a proof of the above statement. Suppose now more 
generally that g is a continuous function from W n — > 
K which is convex and has at most linear growth at 
infinity, i.e., there exists a positive constant K such 
that 



\g(x)\<K(l + \x\), VieR" 1 . 

In that case the recession function g Q 
defined everywhere where 



of g is well 



5oo(a;) 



lim 



g(tx) - ,g(0) 



t 



Va; G 



It can be proved that g^ is a convex function and 
positively homogeneous of degree 1. Given a measure 
/i G {M({2)] m we consider the Lebesgue decomposi- 
tion with respect to the Lebesgue measure C n , fi = 



(fi/£ n )£ n + (I s and we define the 
g(n) as follows: 

9(M)=s(^)£"+5oc(M s ) 



fj 



g 



I/'' 



-valued measure 



(2.1) 



We refer the reader to Theorem |A.2| in Appendix [A] 
for a result regarding lower semicontinuity of convex 
functions of measures with respect to the weak* con- 
vergence. 

2.3 The space [BV{Q)] m 

We recall that a function u G L 1 (i7) is said to be a 
function of bounded variation or else u G BV{Q) if 
its distributional derivative can be represented by a 
K^-valued finite Radon measure. This means that 

di<j>dx = - [ <t>dD iU , Vcf> e Cl{Q), i = l,...,n. 

for some i?"-valued finite Radon measure Du — 
(Diu,...,D n ii).lt is immediate that W 1,1 {Q) C BV(Q) 
since if u G W 1 ' 1 ^) then Du = VuC n . Consistently, 
we say that a function u = (it 1 , . . . ,u m ) G [L 1 (i7)] m 
belongs to [BV{f2)) m if 



u a diS dx 



m 

a = 1, 



dD lU a , 



1. 



In that case Du is an ra x n matrix valued measure. 
A function u belongs to [BV(Sl)] m if and only if its 
variation in f2, V(u, £2) is finite, where, 



V(u, Q) = sup J V f u a 



div(b a dx 



<^G [cl{n)] mn , uu <1 . 



Moreover if u G [BF(i7)] m then \Du\{f2) = V{u,Q) 
and if u G [W 1,:1 (J?)] m , then |Dn|(fi) = J n \Vu\dx, 

where |Vu| = (£?=i E2=i(ft«°) 2 ) 1/a - The space 
[BI/(/2)] m endowed with the norm ||w||sv(i7) : — 
Jo |Du|(!?) is a Banach space. It can be shown 

that if Du = 0, then u is equal to a constant a.c. in 
any connected component of J?. 

Suppose that (u k ) k< z Nl u belong to [BV(Q)] m . We 
say that the sequence (uk)k<£N converges to u weakly* 
in [BV{Q)] m if it converges to u in [L 1 (Q)] m and the 
sequence of measures Du k converge weakly* to the 
measure Du. It is known that {uk)keN converges to u 
weakly* in [BV{Q)] m if and only if (uk)km is bounded 
in [BV{fi)] m and converges to u in [L 1 (J?)] m . The use- 
fulness of the introduction of the weak* convergence is 
revealed in the following compactness result: Suppose 
that the sequence {u k )k G N is bounded in [BV(il)] m , 
where Q is a bounded open set of K™ with Lipschitz 
boundary. Then there exists a subsequence (uk e )eefi 
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and a function u £ [BV(fl)] m such that (uk e )e&s con- 
verges to u weakly* in [BV(fl)] m . 

We say that the sequence (u/c)fcgN converges to u 
strictly in [BV{Q)] m if it converges to u in [L 1 {Q)] m 
and \Duk\(Q) converges to \Du\(fl). It is immediate 
that strict convergence implies weak* convergence. 

Suppose that I? is a bounded open subset of E™ 
with Lipschitz boundary. Let 1* = n/(n— 1) when n > 
1 and 1* = oo when n = 1. Then C L 1 * (fl) 

with continuous embedding. Moreover if fl is con- 
nected then the following inequality holds (Poincare- 
Wirtinger) : 

\\u-u n \\ L i* m <C\Du\(f!), Vu£BV(f2), 

where the constant C depends only on fl and uq de- 
notes the mean value of u in fl: 



1 

W\ 



u dx. 



We refer the reader to Appendix [B] for an introduc- 
tion to weak continuity and differentiabilty notions in 
BV(fl) as well as the decomposition of the distribu- 
tional derivative of a function u £ BV(fl). 



2.4 Relaxed functionals 

Suppose that X is a set endowed with some topology 
t and let F : X — > K U {+oo}. The relaxed functional 
or otherwise called the lower semicontinuous envelope 
of F with respect to the topology r is a functional 
F : X — > MU{+oo} defined as follows for every x £ X: 

F(x) = sup{G(a;) : G:I->RU {+oo}, 

r-lower semicontinuous, G(y) < F(y), 
Vy £ X}. 

It is easy to check that F is the greatest r-lower semi- 
continuous functional which is smaller or equal than 
F. It can be also checked that 

F(x) — sup inf F(y), 
ueAf(x) v^ u 

where Af(x) denotes the open neighbourhoods of x. 
Moreover, if X is a first countable topological space, 
then F(x) is characterised by the two following prop- 
erties: 

(i) For every sequence (x^^n converging to x, we 
have 



F(x) < liminf F(x k ). 

k— voo 



(ii) There exists a sequence (xk)kefi converging to x 
such that 



F(x) > limsupi^Xfc). 

k— >oo 



An interesting property of the relaxed functional is 
that if it has a minimum point then the value of F at 
that point will be equal with the infimum of F i.e. 

minfYa;) = inf F(x). 

xGX xex 

For more information on relaxed functionals see [19] 
and [7]. 



3 The variational formulation 

In the following fl denotes as usual a bounded, con- 
nected, open subset of M 2 with Lipschitz boundary, 
T denotes a bounded linear operator from L 2 (fl) to 
L 2 (fl), uq £ L 2 (fl) and a, j3 are non negative con- 
stants. Further we suppose that / : M 2 — > M. + , g : 
M 4 — > M + are convex functions with at most linear 
growth at infinity, i.e., there exist positive constants 



K\ and K2 such that 

f(x)<K 1 (l + \x\), Vx£ 
g(x)<K 2 (l + \x\), Vx£\ 



(3.1) 
(3.2) 



Moreover, we assume that both / and g are minimised 
in and they satisfy a coercivity condition: 



f(x)>K 3 \x\, Vx£R 2 , (3.3) 
g(x) > K 4 \x\, Vx £ M 4 , (3.4) 

where K 3 and K4 are strictly positive. We want to 
minimise the following functional: 

1 

2 



H(u) 



(u - T{u)) 2 dx + a f(Vu) dx 



-0 



/ 5(V 2 u) dx. 

J Q 



(3.5) 



The natural space for the functional H to be defined in 
is W 2,1 (fl). Since this space is not reflexive a solution 
of the minimisation problem by the direct method of 
calculus of variations does not work. Rather, existence 



of a minimiser of (3.5) can be shown via relaxation 
that is: : We extend the functional H into a larger 
space which has some useful compactness properties 
with respect to some topology and we identify the re- 
laxed functional with respect to the same topology. 
This space will be BH(fl). 



3.1 The space BH(fl) 

The space BH(fl) (often denoted with BV 2 {fl)) is 
the space of functions of bounded Hessian. It was intro- 
duced by Demengel in [5T] and consists of all functions 
u g W 1,1 (fl) whose distributional Hessian can be rep- 
resented by an R 2 x R 2 -valued finite Radon measure. 
In other words: 

BH(fl) = W 1 - 1 {fl) : Vu £ [BV(fl)} 2 }. 

We set D 2 u := D(Vu). Again it is immediate that 
W 2 ' x {fl) C BH(fl). BH(fl) is a Banach space equipped 



G 



with the norm ||u|| Bi? ( fi ) = |M|w(ft) + \D 2 u\(Q). If 
S7 has a Lipschitz boundary and it is connected then 
it can be shown that there exist positive constants C\ 
and C2 such that 

Vu(x)\ dx < C l \D 2 u\{Q)+C 2 I \u(x)\ dx, 
Vw £ BH(Q). 

Moreover, the embedding from BH(Q) into W 1,1 {Q) 
is compact, see [31]. We denote the approximate dif- 
ferential of Vw with V 2 u, see Appendix [b] for the def- 
inition. 

Analogously with BV(Q) we define the following 
notions of convergence in BH(f2): 

Definition 3.1 (Weak* convergence in BH(J1)). 

Let (uk)k<£N, u belong to BH{Q). We will say that Uk 
converges to u weakly* in BH(ft) if 

Uk — > u, in L 1 {Q) 

and 

Vitfc — 51 Vw weakly* in [BV{f2)\ 2 , as k — > 00, 
or in other words 

IK - "llz 1 ^) -> 0, 
yvtife - v-u|| [L i( f2 )]2 0, 

dD 2 u fe -> / dD 2 u, V0 G C (12). 
ft J n 

It is not hard to check that a basis for that topology 

consists of the following sets: 

U(v,F,e) = LeBH(V) : ||v - u\\ L x (n) 
+ \\Vv - Vu|| [i i (fi)] 2 

>i dD 2 v - / & d£> 2 u < e, 

ft y ft 

»€f j, 

where « G BH{Q), F is finite, e > and fa £ C ((7). 
This topology is also metrizable, hence first count- 
able. We don not imply here that BH{fi) is the dual 
space of a Banach space but we name this convergence 
weak* to show the correspondence with the weak* 
convergence in BV{fl). We have also the correspond- 
ing compactness result: 

Theorem 3.2 (Compactness in BH(f2)). Suppose 
that the sequence (uk)keN is bounded in BH(f2). Then 
there exists a subsequence (uk e )ieN and a function u £ 
BH(f2) such that Uk e converges to u weakly* in BH(fl). 

Proof. From the compact embedding of BH(S1) into 
W 1,1 {fi) and the fact that the sequence (Vuk)keN 
is bounded in [i?y(J7)] 2 we have that there exists a 
subsequence (uk^ieti, a function u £ W ' (f2) and 
a function v £ \BV{fl)] 2 such that Uk e converges to 



u in W 1 ' 1 ^) and Vuk t converges to v weakly* in 
[BV(tt)] 2 , as I goes to infinity. Then, Vu = v, u £ 
BH(Q) and Uk t converges to u weakly* in BH(fi). 

□ 

Definition 3.3 (Strict convergence in BH). Let 

(uk)keNj u belong to BH(ft). We say that Uf. con- 
verges to u strictly in BH(f2) if 

Uk — > u, in L 1 (i7) 

and 

\D 2 u k \(f2) -> \D 2 u\(f2), ask^(X). 
It is easily checked that the function 

d(u,v)= [ \u-v\ dx+\\D 2 u\(n)-\D 2 v\(n)\, 
Jn 

is a metric and induces the strict convergence in BH(S7). 
The following Lemma can be used to compare these 
two topologies. 

Lemma 3.4. Suppose that (ufe)fegN; u belong to BH(f2) 
and Uk converges to u strictly in BH(f2). Then 

\\uk - u\\ w i,i(Q) ->• 0, as k -> 00. 



Proof. We recall from (3.6) that there exist positive 



constants C\ and Ci such that 



f |Vu| dx < Ci\D 2 u\{Q)+C 2 I \u\dx, \/u£BH{Q). 
Jn Jn 

Since the sequence (uk)keN is strictly convergent in 
BH{Q), the sequences {\\u k \\L^{n))km and 
(\D 2 Uk\(f2)) k £N are bounded. Hence, there exists a 
positive constant C such that 



/ |Vwfe| dx < C, 
Jn 



Vfc G N, 



which implies that the sequence (uk)keN is bounded 
in BH(fl). From the compact embedding of BH(fl) 
into W 1,1 (S7) we get that there exists a subsequence 
(uk e )iefi and a function v £ W 1,1 (f2) such that Uk e 
converges to v in W l,1 {fl). In particular u k( converges 
to v in L 1 ([2) so v = u and thus Uk t converges to 
u in W ' (O). Since every subsequence of (uk)keN is 
bounded in BH(Q) we can repeat the same argument 
and deduce that for every subsequence of (itfc)fceN there 
exists a further subsequence which converges to u in 
W 1,1 {Q). This proves that the initial sequence {uk)keN 
converges to u in W 1,1 {fl) □ 

Corollary 3.5. Strict convergence implies weak* con- 
vergence in BH( f2). 
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3.2 Relaxation of the second order functional 

We now extend the functional H to BH(Q) in the 
following way: 



'hL( u o~ Tu ) 2 dx 

+a J n /(Vu) dx 

+/3 J n g(W 2 u) dx if tie W 2 ' 1 ^), 

+00 if / e 

BH{Q) \ W 2 ^(Q). 

(3.6) 



H ex (u) = < 



Indeed, as we have discussed above the weak* topology 
in BH(n) provides with a good compactness prop- 
erty which is inherited from the weak* topology in 
[BV(f2)] n . However, the functional H ex is not sequen- 
tially lower semicontinuous with respect to the strict 
topology in BH{Q) and hence it is neither with re- 
spect to the weak* topology in BH(Q). Indeed, we 
can find a function u € BH{Q) \ W 2 ' 1 (f2), see [H] for 
such an example. Hence, from the definition of H ex 
we have H ex (u) — 00. However, according to Theo- 
A.3 we can find a sequence (uk)keN m W 2,l (fi) 



rem 



which converges strictly to u. It follows that the se- 
quences {\\u k \\ L i( n) ) km , (\D 2 u k \(f2)) keN as well as 
( || Vufc H^i (r2) )fceN ar e bounded. Moreover the sequence 
(u k ) k £fi will be bounded in L 2 (f2). Since T is a bounded 
linear operator and from the fact that / and g have 
at most linear growth at infinity we deduce that the 
sequence H ex {u k ) is bounded as well. Hence we get 



H ex (Q) > liminf H ex (u k ), 

k— too 



which proves that H ex is not lower semicontinuous 
with respect to the strict topology in BH(S1). Thus, 
we have to identify its lower semicontinuous envelope 
with respect to the strict convergence. This is called 
the method of relaxation. We define the following func- 
tional in BH(fi) : 



H ex (u) 



1 



(uo — Tu) 2 dx + a /(Vu) dx 



+(3g(D 2 u)(n) 

- / (u — Tu) 2 dx + a /(Vu) dx 
2 Jn Jn 

+P [ ff(V 2 u) dx 
Jn 



where V 2 u, the approximate differential of Vw, is also 
the density of D 2 u with respect to the Lebesgue mea- 
sure, see Appendix [B] It is immediate to see that if 
u 6 W 2,1 {Q) then H ex (u) = H ex {u). Thus is general, 



H ex is smaller than H ex . 



Theorem 3.6. The junctional H ex is lower semicon- 
tinuous with respect to the strict topology in BH{Q). 



Proof. It is not hard to check that since / is convex 
and it has at most linear growth then it is Lipschitz, 
say with constant L > 0. Let u and (wfc)fceN be func- 
tions in BH(fi) and let u k converge to u strictly in 
BH(f2) and thus also weakly* in BH(fl). We have to 
show that 

H ex {u) < liminf H ex (u n ). 

k— ¥ OO 

From the definition of the weak* convergence in BH{f2) 
we have that u k converges to u in W 1 ' 1 (fi). From the 
Sobolev inequality, see [27] . 

IMMfl) < c\\v\\ W i,i {n) , vvew^in), 

we have that u k converges to u in L 2 (J7). Since T : 
L 2 (Q) — > L 2 (Q) is continuous then the map u n- 
\ Jn( Uc > — Tu) 2 dx is continuous and hence we have 
that 

- / (uo—Tu k ) 2 dx — > — / (ua~Tu) 2 dx, as k — > 00. 
2 J n 2 J n 

(3.7) 

Moreover since \\Vu k — ^u\\[ L i^ 2 )] 2 converges to as 
k —¥ 00, we have from the Lipschitz property 



/(Vu fc ) dx - / /(Vu) dx 
n Jn 



|/(V« fc )-/(V«)| dx< 



< 



L / \Wu k - Vu| dx -> 

J S2 

i.e., we have 
/(Vitfc) dx - 



as k 



/(Vu) dx, as k 



(3.8) 



Finally we have that D 2 u k 
apply Theorem A. 2 for fi k = 
D 2 u k and get that 



D 2 u weakly*. We can 
— C 2 , v — D 2 u, v k = 



g(D 2 u)(Sl) < liminf g(D 2 u k ){fl). 

k—too 



(3.9) 



From (3.7), (3.8) and (3.9 1 we get that 



Hex(u) < liminf H ex (u k )- 

k— >oo 



□ 



Theorem 3.7. The functional H ex is the lower semi- 
continuous envelope of H ex with respect to the strict 
convergence in BH(f2). 

Proof. Suppose that (uk)kefi converges to u strictly in 
BH(f2). From the lower semicontinuity olH ex we have 
that H ex (u) < liminf j. H ex (u k ) and since H ex < 
H ex we get that H ex {u) < liminf H ex {u k ). For 
the other direction Theorem |A.3| tells us that given u E 
BH{Q) there exist a sequence (u k ) k< z^ C C°°{Q) D 
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W 2 ' 1 (O) such that Uk converges to u strictly in BH{fl) 
and 



g{D 2 u k ){Q) ^ g(D 2 u){[2). 



(3.10) 



We have also that u k converges to u in W 1 ' (Q) which 
implies that 

„ / («o - Tu k fdx + a I /(Vu fc ) dx -> 



1 



(uq — T(u)) 2 dx + a f(Vu) dx, 



(3.11) 



Q 



as k — > oo, 



as the proof of Theorem 3.6 shows. From (3.10) and 



(3.11 ) we get that 



H ex (u) = lim H ex {u k ) 

k—>oo 



□ 



Remark 3.8. Observe that H ex is also the lower semi- 
continuous envelope of H ex with respect to the weak* 
convergence in BH{Q). 

The proof of the following minimisation theorem 
follows the proof of the corresponding theorem in [48] 
for the analogue first order functional. 

Theorem 3.9. Assuming T{X n ) ^ 0, a > 0, ft > 

then the minimisation problem 



inf H ex (u), 
has a solution u G BH(f2). 



(3.12) 



Proof. Let (u k )k<=N be a minimising sequence for ( |3.12 1 
and let C > be an upper bound for (H ex (uk))keN- 
We have that 

f f{Vu k ) dx<C and \f (u - Tu k ) 2 dx < C, 
J n 2 j n 

(3.13) 

for every k G N From the coercivity assumptions (|3.3[)- 



(3.4) and from (3.13) we have 



\Du k \{Q)= I \\7u k \dx<C, VfcGN, 
J a 



(3.14) 



for a possibly different constant C. We show that the 
sequence (ufc)fceN i s bounded in L 2 (Q), following es- 
sentially [H]. By the Poincare-Wirtinger inequality 
there exists a positive constant C± such that for every 
k G N 



INIUa(«) = 



u k - X n 



1 



\n 

< d\Du k \(Q) 



O 

u k dx 



u k dx 



L 2 (S7) 

u k dx 



<C 



u k dx 



Thus it suffices to bound \ f n u k dx\ uniformly in k. We 
have for every k G N 



T ( X n r^r / u k dx 



< 



T ( Xq-—- [ u k dx ] - Tu k 

u \ J J L 2 (n) 

+ \\Tuf, - uo|| L 2 (x - 2) + ||uq||x,3(«) < 

1 



ITU 



u k - Xq t^t / u k dx 



+ \\Tu k - u \\ L2 



L 2 (n) 



L 2 (17) 



+ I|uo||l2(^) < 

CiHTlllZ^fcK^ + v^C+lluoll^cn) < 
Ci||r||C + v^C+||tio|U»(n) :=C. 



It follows that 



u„ dx 



and thus 



|T(*. 



Ufe dx 



< 



n)\\L 2 (n) 



CI/21 



<C'|fl|, 



|T(X 



since T(Af(/2)) ^ 0. 

Since the sequence is bounded in L 2 (f2) and ]? is 
bounded, we have that the sequence is bounded in 
L 1 (/2) and moreover it is bounded in BH(Q). From 



Theorem 3.2 we obtain the existence of a subsequence 
(uk t )g&{ and u G BH{Q) such that Uk e converges to u 
weakly* in BH(Q). Since the functional H ex is lower 
semicontinuous with respect to this convergence we 
have: 



H ex (u) < liminf H ex (u n ) 

k— >oo 



which implies that 



u = min H ex (u). 
ueBH(n) 



□ 



Let us note here that in the above proof we needed 
a > 0, in order to get an a priori bound in the L 1 norm 
of the gradient (for (3 — see @S]). However, the proof 
goes through if a = and T is injective. If T is not 
injective and a = it is not straightforward how to 
get existence. The proof of the following theorem also 
follows the proof of the corresponding theorem for the 
first order analogue in [4"8] . 

Proposition 3.10. If T is injective or if f is strictly 
convex, then the solution of the minimisation problem 



3.12) is unique. 
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Proof. Using the Proposition |A.1| in Appendix [S] we 
can check that the functional H ex is convex. Let u\, 
u 2 be two minimisers. If T(u{) ^ T(u 2 ) then from the 
strict convexity of the first term of H ex we have 



inf H ex {u) 

ueBH(Q) 

which is impossible. Hence T(u\) — T(u 2 ) and if T 
is injective we have u\ = u 2 . If T is not injective but 
/ is strictly convex then we must have Viii = Vu 2 
otherwise we get the same contradiction as before. 
Since fl is connected there exists a constant c such 
that U\ — u 2 + cXq and since T(Xq) ^ , we get 
c= 0. □ 



4 Special cases and extensions 

In this section we introduce two more versions of the 
functional H ex , the anisotropic case and the case where 
the L 1 norm appears in the fidelity term. As we will see 
in the numerical experiments these models are more 
effective in specific cases. 



4.1 The anisotropic case 

We introduce the anisotropic version of the functional 



H in (3.5 1. Note that when f[x) = \x\ , g(x) 
then the relaxed functional H ex is given by 



(u - Tufdx + a \Vu\ dx + (3\D 2 u\f2, 



1 



Its anisotropic analogue is defined for f(x) = \x\\ and 
g(x) = \x\i. In that case, the relaxed functional is 
given by 



F ( u ) — \ [ ( u o~ Tu) 2 dx + a I ( u , 

+p{\D lUx \{Q) + \D 2 u x \(Q) 
+ \D 1 u y \(n) + \D 2 u y \(f2)). 



ly\) dX 



(4.1) 



As we will see in the numerical implementations the 
minimisation of the anisotropic version gives better 
results when the gradient of the image is paraller or 
vertical to the x-axis. Since the functional F is ob- 
tained for the above choice of / and g, the following 
theorem holds as a special case of Theorem |3.9| 



Theorem 4.1. Assuming T(Xq) ^ 0, the minimisa- 
tion problem 



inf F(u), 

■ieBH{C2) 



(4.2) 



has a solution. If T is injective then the solution is 
unique. 



4.2 L 1 fidelity term 

We consider here the case with the L 1 norm in the 
fidelity term, i.e. 

G(u) = I \u -Tu\ dx + a I |Vu| dx + P\D 2 u\{Q). 

(4.3) 

where for simplicity we consider the case fix) — |x|, 
g(x) — \x\. As it has been shown in 38] and also stud- 
ied in [M] and [25] , the L 1 norm in the fidelity term 
leads to efficient restorations of images that have been 
corrupted by impulse (speckle) noise. 

Theorem 4.2. Assuming T(Xq) ^ 0, the minimisa- 
tion problem 



inf G(u), 

u€BH(Q) 

has a solution. 



(4.4) 



Proof, the proof is another application of the direct 
method of calculus of variation. Similarly with the 



proof of Theorem 3.9 we show that any minimising 
sequence is bounded in L 1 (.!?). Hence it is bounded 
in BH(f2). Thus we can extract a weakly* conver- 
gent subsequence in BH(f2). Trivially, the functional 
is lower semicontinuous with respect to that conver- 
gence. □ 

Note that in this case the uniqueness of the min- 
imiser cannot be guaranteed since the functional G is 
not strictly convex anymore, even in the case where 
T = Id. The more general case of Theorem 3.9 can be 



easily extended to the cases discussed in Sections 4.1 
andSH 



5 The numerical implementation 

In this section we are working with the discretised ver- 
sion of the functional 13.51 and we describe the numeri- 
cal method we use in order to implement the minimi- 
sation, namely the split Bregman method. In partic- 
ular, in Section [5.1| we introduce the discrete versions 
of L 1 and L 2 norms as well as the discrete gradient 
and Hessian. In Section |5.2| we provide an introduc- 
tion to the Bregman iteration stating some of its im- 
portant properties and we describe how the Bregman 
iteration can be used for the solution of constrained 
optimisation problems, an idea originated in |39j . In 
[31] the Bregman iteration combined with a splitting 
technique (split Bregman) was used in order to solve 
the total variation minimisation. In the latter paper 
it was also proved that the iterates of the Bregman 
iteration converge to the solution of the constrained 
problem assuming that the iterates satisfy the con- 
straint in a finite number of iterations. We give a more 
general convergence result where we do not use that 
assumption, see Theorem |5.5| Finally in Section |5.3| 
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we describe how our problem can be solved with the 
Bregman iteration, using the splitting procedure men- 
tioned above. 



5.1 The discretised setting 

In this section we study the discretisation and minimi- 
sation of the functional 13.51 In our numerical exam- 
ples we consider two different choices for / and g. The 
first one is isotropic regularisation with f(x) — \x\, 
g(x) — \x\ that is to say when 



Hex{u) = -l {u Q -Tu) 2 dx+a |Vu| dx+/3\D 2 u\(0). 
2 J a Jn 

(5.1) 

The second choice is anisotropic regularisation with 
f(x) = \x\i, g(x) — \x\i. In that case the functional 
has the form 

H ex (u) = - I («o - Tufdx + a I (\u x \ + \u y \) dx 
1 Jn Jn 



-p{\D lUx \{D) + \D 2 u x \{D) 

-\D lUy \(n) + \D 2Uy \(n)). 



(5.2) 



Moreover, the data fidelity term is either measured in 
the L 2 or in the L 1 norm. For these choices we have 
to define a couple of discrete operators and norms. We 



will denote the discretised version of (5.1) with J. In 



the discretised setting, u will be an element of IR nxm , 
T will be a bounded linear operator from ]R nxm to 
M nxm . Moreover, for f(x) = \x\ and g(x) = \x\, the 
discrete functional J is given by 

J(u) = i||«o - Tu\\\ + a||Vu||i + /3||V 2 n||i, (5.3) 



where for every u £ M" xm , for every v — (vi,v 2 ) € 



and for every w — (wi , 7«2, 7^3, W4) € 



we define 



1/2 



ll«l| 2 := 



v 2 := 



w 2 



EE^) : 

n m 

n m 
V i=l j=l 

\ 1/2 

+ w 3 (i,j) 2 + w 4 (i,j) 2 ) 



1/2 



i=i j=i 



i* : = EE Mm) 2 j) a ) 
i=i j=i 



2\l/2 



ink := EE ( w ^ i ^) 2 + w 2(hj) 2 
1=1 3=1 

+ w 3 (hj) 2 + Wi(i,j) 2 ) 1/2 . 

We follow [T^] and [5] for the formulation of the dis- 
crete gradient and Hessian of u £ E™ xm . That is, 
we define V and div consistently with the continuous 
setting as adjoint operators. The same for the Hes- 
sian H and its adjoint divH. We have that Vit = 
((Vu)i, (Vu) 2 ), where 

u(i,j + 1) — u(i,j) if 1 < i < n, 



(V«)i(m1 = < 



(V«) 2 (i,i) = < 



1 < j < m, 

if 1 < i < n, 

j = TO. 

u(i, i + 1) — u(i, j) if 1 < i < n, 

1 < j < TO, 

if £ = n, 

1 < j < TO. 



and V 2 u = ffu = ((!Tu)ii,(iru)ia,(Hu)a 1 ,(iru)a2), 
where 



u(i, j + 1) — 2u(i, j) if 1 < 7 < n, 
+it(i,j — 1) 1 < j < m, 

u(i, j + 1) — u(i,j) if 1 < i < 71, 

i = i, 

i - 1) - i) if 1 < i < 



{Hu) u {i,j) = < 



} 4 (ff«)2a($,i)= < 



(Hu) 12 (i,j) = < 



(Hu) 21 (i,j) = < 



.7 = m - 

w(i + 1, j) — 2u(i,j) if 1 < i < n, 
+u(i-l,j) l<j<m, 
u(i + 1, 1) — u(i,j) if i = 1, 

1 < j < TO, 
if i = n, 

1 < J < TO. 

u(i + — u(£ + 1, j — 1) if 1 < i < 7i, 
-u(i,j) +u(i,j - 1) Kj<m, 
if i = n, 

1 < j < TO, 

if 1 < 7 < 77, 

J = l. 

7i(7, j + 1) — 7_t(7 — 1, j + 1) if 1 < 7 < 77, 
-U(i,j) - 1, j) l<j<777, 

if i = 1, 

1 < j < TO, 
if 1 < 7 < 77, 

j = TO. 
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The elements (Hu)n, (H1122), {Hu)\2, {Hu)2\ de- 
note the discretised second derivatives u xx , u yy , u xy , 
u yx respectively. 

We also need to define the discrete divergence and 
second divergence operators, div and divH, that have 
the properties: 



div : (m™ xm y 



div(u) • u = v ■ V«, 



with 



Vw G 



v e (R nxm Y 



divH 



with 



divH(w) • u — w ■ Hu, Vu G 



w G 



where the dot denotes the Euclidean inner product, if 
we consider u, v and w as large vectors formed succes- 
sively by the columns of the corresponding matrices. 
We refer the reader to Appendix [C] for the exact for- 
mulation of these divergence operators. 



5.2 Constrained optimization and the Bregman 
iteration 

Suppose we have to solve the following constrained 
minimisation problem: 



min E(u) such that Au = b. 



(5.4) 



where the function E is convex and A is a linear map 
We transform the constrained minimi- 



from ' 



to 



sation problem (5.4 1 into an unconstrained one, in- 



troducing a parameter A. In order to satisfy the con- 
straint we have to let A go to infinity: 



sup min E(u) 

x uem d 



A 



\\Au- b\ 



(5.5) 



Instead of doing that we perform the Bregman itera- 
tion as it was proposed in [59] and [53]: 



Bregman Iteration 



Uk+i = min E(u) + '-\\Au - 6 fe |||, 



2 



b k+1 =b k +b- Au k+1 . 



(5.6) 
(5.7) 



In this section, we give a short description of the 
Bregman iteration together with some known results 
as these were presented in [3"l"ll3"9"Il5"4"] . We show that 

con- 



under certain conditions, the iterates u k in 5.6 



verge to the solution of the constrained minimisation 
problem |5.4| In Section |5.3| we describe how we can 
transform the unconstrained minimisation problem |5.3| 
min u J(u) into a constrained one and then solve it 
with the Bregman iteration using a splitting tech- 
nique. 



We start with some basic definitions. Recall that 
if E is a convex function from K d to M then the subd- 
ifferential of E evaluated at v G R, denoted by dE[v], 
is the following set: 

dE[v] := {p G R d : E{u) > E(v)+(p, u-v), Vu G R d }. 

The Bregman distance associated with the convex func- 
tion E at the point v is defined as 

D P E (u, v) := E(u) - E(v) -(p,u- v) 

where p belong to the subdifferential of E evaluated at 
v. From the definition D p E (u : v) > and if E is strictly 
convex then D E (u,v) — if and only if u = v. In 
general D E (u, v) ^ D E (v, u) so the Bregman distance 
is not a metric. However, the Bregman distance does 
a carry a notion of distance since for every w of the 
form w = \u + (1 — X)v, A G [0, 1] we have D E (w, v) < 
D p E (u,v). 

For notational convenience we set H{u) :— \\Au — 
The original form for Bregman iteration as it was 
proposed in [10] reads: 

arg min D V E (u, Uf~) 



u k +i 



H{u) 



(5.8) 

ueK d 

Pk+i =Pk - VH(u k+ i). (5.9) 

The following theorem shows that iterates are well de- 
fined in the sense that p k always belongs to the sub- 
differential of E evaluated at some point, except per- 
haps for k — 0. It is a simple modification in the finite 
dimensional setting of the corresponding proposition 
in |39j where we refer the reader to for a proof. In 
particular, in 39J, they work in the continuous set- 
ting and consider the case where E(u) = \Du\(f2) and 
H(u) = |||/ - Ku\\, for / G L 2 and K a bounded lin- 



ear operator. As we will see in Section 5.3 in our case E 



will be essentially equal to J. Notice that J is strictly 
convex when T is injective. 

Theorem 5.1. Suppose that E is a strictly convex 
functional from M. d to R. Moreover, assume that E 
is coercive i.e. there exist constants C > and D 
such that E(u) > C\\u\\ 2 + D for all u G R d . Finally 
let Uq = and po = 0. Then for every k G N, the 



solution to the minimisation problem {5.8) is unique 
and p k G dE[u k ] for k = 1,2,.... 

The following auxiliary theorem states some mono- 
tonicity properties of the Bregman iteration. Again see 
[55] for a proof. 

Theorem 5.2. Suppose that E is a convex functional 
and consider the algorithm (5.8)-(5.9) with the initial- 



isation u a — and p = 0. Suppose that the solutions 



u k to the minimisation problems (5.8) exist. Then we 
have: 



H(u k ) < H(u k ) 
fc = 2,3,.... 



D E k '(tt^tife-i) < H(u k -i), 



E 

H(u) 
Vu G 1 



H(u k ) < 



E (u,U k -i), 

• > fe = 1) 2, . . . . 



(5.10) 



(5.11) 
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In particular if u is a minimiser of H we get 
D p E k («, u fc ) < B"'- 1 {u, u k -i), k = 2, 3, . . . . 



From (5.151 and (5.9) and the fact that Au = b, we 
get for fc > 2, 

fc 



The next theorem reveals more information about E{u) > -D B ("i""" i)+^(0) 



the convergence of the algorithm ( 5 . 8 1- ( 5 . 9 ) . The proof 
that we provide is a simple adjustment in the finite di- 
mensional setting of the corresponding proof in [39] . 
Moreover, our version includes a further remark, in- 
equality (5.13), that we are going to use later. 

Theorem 5.3. Suppose that E is a convex functional. 
Consider the algorithm (5.8\ l-{ 5.9 ) with initial values 
uq = and po = 0. Suppose that the solutions Uk 
to the minimisation problems (5.8) exist. Then for a 
minimiser u of H and for every k = 2, 3, . . . we have: 



= E{u k ) - ^2(p v -i,u v - u v -\) 



v=l 



E(u k ) - (p k -i,u k - u) 



k-l 



H{u k ) < H{u) 
and in fact 

oo 

J2(H(u k ) - H(u)) < 

k=l 



E{u) - E{ui) 
jfe^l : 



(5.12) 



(5.13) 



which tells us that (u k )keN is a minimising sequence 
for H. Moreover, there exists a positive constant C 
such that 

E{u k ) <C, Vfc e N. (5.14) 

Proof. By taking the sum y 1 _^ in (5.11 1 we get 

fc 

D p E k {u, u k ) + J2 ( d e^ (u»,u„-x) + H(u v ) - H(u)) < 



k-l 

E(u k ) + Y,{VH(u v ),u k -u) 
k-l 
u=l 

k-l 

E(u k ) + 2 J2{A T (Au„ -b),u k -u) 

v=l 

k-l 

-2Y^{A T {Au v - b),u p - u) 

v=l 

k-l 

E(u k ) + 2^2(Au v - 6, Au k - b) 

v=l 

k-l 

-2^11^-6111 



k-l 



D° E (u,u ) = E(u)-E(0), 
and hence 



(5.15) 



> E(u k ) - (fc - 1)|| Au fc - b\\ 2 2 - 3^ \\Au v - b\ 

v=l 

> E(u k ) - A{E{u) - E( Ul )) - 3H(ui). 
This is because 

k-l 



2^2(Au„-b,Au k -b) > 



v=l 
k-l 



D p E k {u, Ufc ) + W K< u "-i) + H ( u ») - H &)) - E W Au « - b Wi-( k - l )\\ Au k - H 



v=l 



< D° E (u,u ) = E(u) - E{ui) > 0. 



(5-16) Hence, dlTTl follows. 



□ 



From the non-negativity of the term D P E 1 (u v , u v -\) 
and the monotonicity of H(u v ) we get 

D Pk (fi, u fc )+(fc-l) {H(u k ) ~ H(u)) < E(u)-E( Ul ) => 

(fc - 1) (H(u k ) - H(u)) < E{u) - E( Ul ), 



Remark: The above theorem tells us that the quanti- 
ties E{u k ) + H(u k ) are uniformly bounded in fc. From 
now on we assume that H has the form H(u) = 
(\/2)\\Au — 6|||, where A is a positive constant. Note 
that all the previous theorems hold. In that case the 
algorithm (|5.8|)-(|5.9|) now reads: 



which proves (5.12). In fact from (5.16) we also get 
that 

fc 

^2(H(u v ) - H(u)) < E(u) -E( Ul ), VfceN 



Uk+i = arg min E{u) - (p k , u - u k ) + ~\\Au - b\\ 



(5.17) 



v=1 



thus 



5.13) follows. Recall now that H(u) — \\Au—b\ 



Then (5.12) implies for Au — b that 

fc 

(fc - l)\\Au k - bf 2 < Uu v -b\\%< E{u) - E{u x ). 



Pk+i =Pk~ XA T {Au k+1 - b), (5.18) 

where we have ignored the constant E(0) in the ob- 
jective function in (5.17). We are going to state an 
equivalent formulation of the above iteration that was 
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investigated in [53]. We have the following two ver- 
sions of the algorithm first of which is the algorithm 



(5.17)-(5.18) 



Algorithm 1: 

it = 0, po = 0, 

U fe+ i = arg min E(u) - (p k , u - u k ) 

u£R d 



+^\\Au-b\ 



Pk+i =Pk~ XA T (Au k+1 - b). 

Algorithm 2: 

it = 0, b a = 0, 
bk+i =b k + b- Au kl 

Uk+i = arg min E(u) + -\\Au - b k 



(5.19) 

(5.20) 
(5.21) 

(5.22) 
(5.23) 

(5.24) 



The following theorem from [53] shows that the two 
formulations are equivalent. 

Theorem 5.4. Suppose that the minimisation prob- 
lem (5.20) has a unique solution for every k. Then the 



Algorithm 1 {5.1^5.21^ and the Algorithm 2 \5.22 



5.24-) are equivalent in the sense that the minimisation 



problems (5.20) and (5.24-) have the same objective 



functions that might differ up to a constant. 

The following theorem was proved in |31j in the 
case where the iterates satisfy the constraint in a fi- 
nite number of iterations. Here we give a more general 
proof where we do not use that assumption. 

Theorem 5.5. Suppose that the constrained minimi- 
sation problem (5.4) has a unique solution u* . More- 



over, suppose that the convex functional E is posi- 
tive and coercive. Finally suppose that the Theorem 
\5.4\ holds as well. Then the sequence of the iterates 
(u k ) k £fi of Bregman iteration (i.e. given by Algorithm 
1 or 2) converges to it*. 

Proof. From the fact that Theorem pT4] holds and from 



( |5.12[ )-( |5.13[ ) we get that, for some M > 
M 



||i4tt fc -6|| a < 



oo 

and \\Au k - b\\\ < oo. 



\lk > 1 



(5.25) 



fc=i 



Moreover we have (po — 0) that 



2- 



i/=l 



From the Theorem |5.3| and the coercivity of E we get 
that the sequence || life || 2 is bounded, say by a constant 
C . Thus it suffices to show that every accumulation 
point of that sequence is equal to u* . Since Au* = 



b, we have for every increasing sequence of naturals 

\\\Au* - b kl \\l < ^ (|| An* - Au ke \\ 2 + \\Au ke - b ke \\ 2 f 
A 2 

= 2 (W Auk e ~ b h + \\Auk e ~ h e h) 

AM 

< 



2(k ne - 1) 
+X\\Au ki - b\\ 2 \\Au ke - b kf \\ 2 



A 



Au ke 



and because of the fact that 



E(u ke ) + ^\\Au kf - b kt \\l < E(u*) + ^\\Au* 



we get that 
E(u kt ) < E(u*) + 



AM 
2(k e - 1) 
+X\\Au k( - b\\ 2 \\Au ke - b ke \\ 2 
AM AM 



< E(u 
+A|| 

< E(u*) 



2{k t -l) ^fki^l- 
+\\\Au kl - b\\ 2 \\b kl , || 2 

AM ; A||A||C 
2(k e - 1) + 



\Au k . 



-\\\Au ke ~ b\\ 2 J2\\Au v - b\\ 



Suppose now that u kl converges to some u as k goes to 
infinity then u also satisfies ||^4u — b\\ 2 — 0. Then if we 
take limits because of Kronecker's lemma, see Lemma 



D.l the right hand side limit is E(u*). We have 



E(u) < E(u*) 

and since u* is the solution of the constrained minimi- 
sation problem we have u — u* . We conclude that the 
whole sequence {u k ) k ^ converges to u* . □ 



5.3 Numerical solution of our minimisation problem 

In this section, we explain how the Bregman itera- 
tion together with an operator splitting technique can 
be used to implement numerically the minimisation of 
our functional. The idea originates from [31] where it is 
called Split Bregman algorithm. Let us mention here, 
however, that this iterative technique is equivalent to 
certain instances of combinations of the augmented 
Lagrangian method with classical operator splitting 
such as Douglas- Rachford, see [43] . We also refer the 
reader to the preprint of Burger, Sawatzky, Brune, 
Benning [4] for applications of Bregman methods to 
higher-order regularisation models for image recon- 
struction. 

Exemplarily, we present the resulting algorithm for 
the minimisation of J in (5.3), i.e., for f(x) = \x\, 
g{x) = \x\ and an L 2 data fidelity term. The other 



instances of the general higher order model (3.5| that 
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we consider in Sections |6j [7] and [8] are solved with the 
same technique, which result into very similar algo- 
rithms as the one described in the sequel. 

Recall that we want to solve the following uncon- 
strained minimisation problem: 



mm - u 



Tu\\l +a||Vu||i +/3||V 2 u||i (5.26) 



The derivation of the Split Bregman algorithm for 



solving (5.26) starts with the first important obser- 



vation that this above minimisation problem is equiv- 
alent to the following constrained minimisation prob- 
lem: 



1 



mm 

«e(i"") 2 



uo-r«||! + aH|i + /3H 



(5.27) 



such that v = Vu, w = V 2 it. 

For every u £ R nxm we denote by col(u) the nm— th 
column vector formed successively by the columns of 
u. With this notation we introduce the column vector 



UJ 



(col(u), col(vi), C0l[V2), 

col(wu), col(w 22 ),col(wi 2 ), col(w 2 i)) , 



which has total length 7mn — d. Since the gradient 
and the Hessian are linear operations it is clear that 



the minimisation problem (5.27) can be reformulated 
into the following problem: 



min E(oj) such that Auj = 6, 



(5.29) 



It follows that b is the zero vector. Recall that for 



constrained minimisation problems of the type (5.27) 



the corresponding Bregman iterative scheme will be 
A 



u k +i = mm 



\Au - 6 fe || 2 , 



bk+i = b k + b- Au k+1 . 



(5.30) 
(5.31) 



It is easy to see that the iterative scheme of the 
type ( 5.30|5.31 ) that corresponds to the constraint 
minimisation problem (5.27) will be : 



(u 



k+ \v k+ \w k+1 ^ - 



) = argmin -||w - Tu\\ 2 + a|H|i 

u.v.w ** 



b k+1 

h k+l 



+P\\w 

A 
+ 2 
b\ H 



A 



b k 



2 

b k 2 + V 2 u 

Vu k+1 - 
VV +1 - 



b k + Vu-v\ 



2- 



,fc+l 



,M+1 



(5.32) 

(5.33) 
(5.34) 



where b 1 



k+l 



(bii\b k + 1 ) £ (I 



pnxm\2 



) 2 and b k+1 



/.fe+1 .fe+1 ,k+l l,k+l\ G /Ti 
V"2,ll' u 2,22' "2.12J "2,21 7 t V 11 



Remark 5.6. Notice that at least in the case where 
T is the identity junction, the minimisation problem 



5.32) has a unique solution. Moreover, the functional 



E is positive and coercive and the constrained minimi- 



(5.28) sation problem (5.21) has a unique solution. Thus, the 



iterative scheme (5.30)-(5.31) is indeed equivalent to 

I holds. 



the Bregman iteration and Theorem\57, 

Our next concern is the efficient numerical solu- 



tion of the minimisation problem (5.32). We follow 



[31] and iteratively minimise with respect to u, v and 
w alternatingly: 



where E : E d — > E + is convex, A is a d x d matrix and 
b is a vector of length d. For completeness, we describe 
the form of E and A. Define the following projections 
P u , P v and P w acting on ui such that if u> has the form 



(5.28) then 



P u (u) = {col{u)), 

P v {ui) = (col(vi),col(v 2 )), 

P w {u) = {col(wii),col(w 22 ),col(wi 2 ),col(w 2 i)). 
Then we set 

E{u) = \\\col{u Q ) ~ T{P u {u))\\l + aWPvMh 



Vw £ 



where T is seen now as a linear operator from 



to 

M. nm . As far as the constraints in 5.27 are concerned, 
these can be written as follows: 



Auj 




Split Bregman for Isotropic TV BH - L 2 
Standard Splitting 



6? = 0, b° 2 =0, 



,k+i 



,/c+l 



k+l 



k+l 

k+l 
'2 



argmin -\\u - Tu\\ 2 



\bt + Vu- v k \ 



+ ^\\b k +\7 2 u-w k \\l 
argmin a||i;||i 

uG(K™ xm ) 2 



A 



bt+Vu k+1 -v\\i, 



argmin /3||w||i 

oG(R"X m ) 4 

A 
f 2 



b k +V 2 u k +i-w\\l 



b k + Vu k+1 -v k +\ 



2 „, k+l 



k+l 



(5.35) 



(5.36) 



(5.37) 



(5.38) 

(5.39) 
(5.40) 
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The above alternating minimisation scheme, make up 
the split Bregman iteration that was proposed in |31j 
to solve the total variation minimisation problem as 
well as problems related to compressed sensing. For 
convergence properties of the split Bregman iteration 
and also other splitting techniques we refer the reader 
to [3511251 1151 . In g5J the split Bregman method is 
proved to be equivalent with the so-called Douglas- 
Rachford algorithm and its convergence is guaranteed. 



The first of the above minimisation problems ( 5.36 ), 



can be solved through its optimality condition. This 
condition reads: 

T*Tu - Adiv (Vu) + AdivH (Hu) = 
T*(u ) + Xdiv((b k -v k )) 

-AdivH ((&§ - w k )) , (5.41) 
where T* denotes the adjoint of the discrete operator 



T. Since all the operators that appear in (5.41) are 



linear, it leads to linear system of equations with nm 
unknowns. 



The solution of the minimisation problems (5.37) 



and (5.38) can be obtained exactly through a gener- 



alised shrinkage method. It was used in both [31j and 
[50J . It is a simple computation to check that if a e R" 
then the solution to the following problem: 



A 



mm \\x\ 



(5.42) 



can be obtained through the following formula: 



Si (a) 



| 2 -io 



I a 1 1 2 



(5.43) 



Each of the objective functionals in (5.37) and (5.38) 



can be written as a sum of functionals of the same 
type in (5.42) where n = 2, 4 respectively. Thus the 



solution to the problems (5.37 ) and ( 5.38 1 can be com- 
puted as follows: 



,,fc+i 



fc+i 



fVu fe+1 (i,j)), 



w 



k+l, 



w 



k+1 'iJ),wl +1 (i,j), 



Sf $(i,j) + V 2 u fc+1 (i,j)) 



(5.45) 



for i = 1, . . . , n and j — 1, . . . , m. 

Let us note here that another formulation alter- 



native to (5.27) is possible. In particular the problem 



(5.26) is also equivalent with the following: 

min -||u -Ttt||| + a||'w||i+^||u;||i, (5.46) 

u£l" xm * 

«,e(R" xm ) 4 

such that v — Vu, v = v, w = VS. 

Working as previously we can derive the following 
splitting technique for the minimisation of |5.46| 



Split Bregman for Isotropic TV — BH — 1/ 
Alternative Splitting 



,/c+l 



~fe+l _ 



argmin -\\u - Tu\\ 2 

ue %nXm Z 

+ ^\b1+Vu~v k \\l 
argmin — I|&x,l + 



(5.47) 



villi 



- X \\b k 



2.1 



v l\\ 2 



+ 2 life' & 3,i 2 ) + V«i - K\,Oll^ (5.48) 



argmin -\\b k 2 + V 2 u k+1 

+hb k ' -•* 112 



v 2 \\i 



2 
A 



5 2,2 + V 2 -V2\\ 2 



+ '^\\(bl 21 ,bl 22 ) + Vv 2 ^(w k 1 ,w k 2 )\\ 2 2 (5.49) 



k+l 



argmm Q!||u||i 

l)£(E«'") 2 



Js+l _ 



argmin /3\\w\\ 



-~\\b k 



~k+l 



b k+1 =b1 + Vu k+l - v 



Vw 



fe+i „-,fe+i 



12! 



b k+1 

h k+l 

°3 



~fe+l 



V 



b k + Vi> k+1 -w k+1 . 



(5.50) 



(5.51) 

(5.52) 
(5.53) 
(5.54) 



Notice that the above splitting leads to three linear 
system of equations for the optimality conditions of 



(5.47), (5.48) and (5.49) instead of one for the stan- 



dard splitting. However the matrices that correspond 
to these systems are sparser than the one that corre- 



(5.44) sponds to the optimality condition of ( 5.36 1 . As Table 



[T] shows, the standard splitting is a bit faster but we 
believe that there is room for further optimization of 
our code in order to explore this extra sparsity. 

We have performed numerical experiments in de- 
noising, deblurring and inpainting. In all our compu- 
tations we used A = 1. Let us note here that in all of 
our numerical examples the range of image values is 
[0, 1] (zero for black color and one for white). 

6 Applications in Denoising 

In this section we discuss the application of the TV- 



BH approach ( 5.3 ) to image denoising. We show exper- 



iments for different levels and kinds of noise (Gaussian 
and impulse noise) and with various combinations of 
a and f3. Note that for f3 = 0, our model corresponds 
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to the classical Rudin-Oshcr-Fatemi denoising model. 



The operator T in (5.3) equals the identity. Our basic 



non-real world test image can be seen in Figure [3j 




Fig. 3 Main test image. Resolution: 200x 300 pixels 



Our main assessment for the quality of the recon- 
struction is the structural similarity index SSIM 52, 
[5Tj . The reason for that choice is that in contrast with 
traditional quality measures like the peak signal-to- 
noise ratio PSNR, the SSIM index also assesses the 
conservation of the structural information of the re- 
constructed image. Note that the perfect reconstruc- 
tion would have SSIM value equal to 1. A justification 
for the choice of SSIM as a good fidelity measure in- 
stead of the traditional PSNR can be seen in Figure 
|4j Both images are results from denoising with the 
first order method (f3 = 0, noise: Gaussian, Variance 
= 0.1). The left picture is the one with the highest 
SSIM value (0.6595) while the SSIM value of the right 
picture is significantly lower (0.4955). This assessment 
comes into an agreement with the human point of view 
since, even though this is subjective, one would con- 
sider the left picture as a better reconstruction. On the 
other hand the left picture has slightly smaller PSNR 
value (14.02) than the right one (14.63), which was 
the highest PSNR value. Similar results are obtained 
for $ ^ 0. 

A stopping criterium for our algorithm we used a 
predefined number of iterations. In most examples this 
number was 300. We observed that after 80-100 iter- 
ations the relative residual of the iterates was of the 
order of 10~ 3 or lower (see also Table [l| and hence 
no noticeable change in the iterative solutions was ob- 
served after that. 

In the following we shall examine wether the in- 
troduction of the higher order term (/3 ^ 0) in the 
denoising procedure, produces results of higher qual- 
ity. 

6.1 Isotropic denoising with L 2 fidelity term 
We start with considering the minimisation of 
J(u) = -|K - ug + a\\Vu\\i + /3||V 2 k||i, 



which we use as an isotropic higher order TV model 
to eliminate Gaussian noise in the given image uq. 
The noise has been produced with MATLAB's built 
in function imnoise. We compare the first order with 
the second order method, by showing the image that 
has the highest SSIM value among those that were 
recovered with j3 = and j3 ^ respectively. Note, 
however, that the optimal /? value in terms of SSIM 
does not always correspond to the /3 value which visu- 
ally reduces the staircasing effect best. The latter is in 
general slightly bigger than the one chosen by SSIM, 
see Figure [6] Still, for proof of concept, we prefer to 
stick with an objective quality measure and SSIM, in 
our opinion, is the most reliable choice for that matter. 

Figure [5] depicts one denoising example, where the 
original image is corrupted with Gaussian noise of 
variance 0.005. The highest SSIM value for the isotropic 
TV-denoising is achieved for a = 0.12 (SSIM=0.8979) 
while the highest one for the second order TV method 
is achieve d for a = 0.06, (3 = 0.03 (SSIM=0.9081). 
In Figure |5(b)| we have plotted the middle row slices 
of the images (original, corrupted and restored) . The 
reduction of the staircasing effect is clearly visible. 
Let us note that for larger values of P the decrease 
of the staircasing effect is even more noticeable but 
the solutions become slightly more blurry. In Figure 
[6] we have plotted again the image with the highest 
SSIM, a = 0.06, (3 = 0.03 (SSIM=0.9081) and the re- 
stored image that corresponds to a — 0.06, p — 0.06 
(SSIM=0.8989). The second image is still pleasant to 
the human eye, despite being slightly more blurry, and 
the staircaising effect has almost disappeared. 

Next we check how the SSIM and PSNR values 
of the restored images behave as a function of the 
weighting parameters a and p. In Figure [7] we plot 
the results for a = 0, 0.02, 0.04, . . . , 0.3 and P = 
0, 0.005, 0.01, . . . , 0.1. The plots suggest that both qual- 
ity measures behave in a continuous way and that they 
have a global maximum. However, PSNR tends to rate 
higher those images that have been processed with a 
small value of P or even with P — which is not the 
case for SSIM. An explanation for that is that higher 
values of p result to a further loss of contrast, see Fig- 
ure [HJ something that is penalised by the PSNR. The 
SSIM index penalises the loss of contrast as well but 
it also penalises the creation of the staircasing effect. 
This is another indication for the suitability of SSIM 
over PNSR. Note also, that the contrast can be re- 
covered easily in a post-processing stage while it is 
not an easy task to reduce the staircasing effect using 
conventional processing tools. 

We have also considered numerical experiments with 
higher noise levels. In Figure [9] the given image is 
corrupted with Gaussian noise of variance 0.05. In 
that case the highest value of SSIM for higher or- 
der denoising is achieved for a = 0.3 and P = 0.05, 
SSIM=0.8226, while the highest value for TV denois- 
ing is achieved for a = 0.2, SSIM=0.8162. Note that 
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Fig. 4 Justification for the usage of SSIM index. Best SSIM value (1.) (SSIM=0.6595, PSNR=14.02) and best PSNR value 
(r.) (SSIM=0.4955, PSNR=14.63) among reconstructed images with the first order method (/3 = 0). The initial image is 
contaminated with Gaussian noise of variance 0.5. The better SSIM assessment of the first image agrees more with the 
human perception 



for every fixed value of a the introduction of the higher 
order term is always increasing the SSIM value. How- 
ever, for higher noise levels we observe in Figure [9 (b)| 
only a small improvement of the staircasing effect us- 
ing the higher order denoising. Even though we see 
some improvement with the introduction of the sec- 
ond order term, it can not eliminate the artifacts in 
the restored image when the original image is heavily 
corrupted with noise. 



6.2 Anisotropic denoising with L 2 fidelity term 

In anisotropic denoising, our task is to implement the 
following minimisation problem: 



min dK-u|| 2 a(||(Vu)i||i 

^(\\(H U ) n \\ 1 + ||(ITu)22||l 

+ ||(i*tt)ia||i + ||(Ftt) ai ||i) 



l(V«) 



Analogously with the isotropic case, we use a Split 
Bregman algorithm similar to (5.35)-(5.40) to com- 
pute a minimiser of (6.1). In short form this reads 



Split Bregman for Anisotropic TV — BH — L 2 
denoising 



k+1 • 1 1 2 

u — argmm -||ito — u\\ 2 



A 



-\\b k +Vu-v 



fc|]2 



+V 2 u-w k \\ 2 , 



v k+1 = argmin 



2 

E 



a \\vt h 



+ ^ll^ + (VH fe+1 ).-^|| 



(6.2) 
(6.3) 

(6.4) 



w k+1 = argmin 




P\\ w e,v\ 



- X \\b k 



2J,v 



(V 2 u 



2„,k+l\ 



Wi 



i>\\2 ' 



b k+L =b k + Vu k+1 - v 



b k+L = b% + V z u 



2„,k+l 



,k + l 
k+1 



(6.5) 

(6.6) 

(6.7) 
(6.8) 



where b\ +1 = (ftf^ 1 ,^ 1 ) € (R nxm ) 2 and b 

llM+l uk+l .fc+1 l,k+l\ G /t| 
I, "2,11' u 2,22' u 2.12j "2,21 7 t V 11 



fc+1 



t" xm ) 4 . The reader should 
compare the algorithm (6.2 1-(|6.6|) with the algorithm 



(5.36|-( 5.38 1. The minimisation problem (6.2) is the 



same with (5.36 ) and it is solved through its optimality 



condition. The minimisation problems (6.4) and (6.6) 



can be solved exactly using shrinkage operators. No- 
tice that the difference to the isotropic case is that v\ 
and V2 are decoupled as are w\\, W22, W12, W2i- Thus, 
if we define: 



(6.1) S|"(o) 



.0 



a G 



then the solutions to these problems are: 

v k+1 ( t , j) = sf {bis, 3) + (v M fc+1 M*, jj) , 
£ = 1,2. 

w k t +\i,j) = Sf {blt v (i,j) + {V 2 u k+X ) lv {i,j)) , 

e,v = 1,2. 

for i = 1, . . . , n and j = 1, . . . , m. 

As in the previous section we tested the algorithm 
for various combinations of the parameters a and f3 al- 
ways choosing the results with optimal SSIM for pre- 



sentation. In Figure 10 the best results from the first 
and the second order anisotropic denoising method for 
an image with Gaussian noise with variance = 0.005 
are displayed. According to SSIM index the best TV- 
denoising result is the one obtained with a — 0.12, 
(SSIM=0.8993). The best result for the second or- 
der denoising is obtained for a = 0.04, (3 — 0.025, 
(SSIM=0.9164). Note that the best SSIM values of the 
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□0[:3 

(a) (u.l.) clean image, (u.r.) image corrupted with Gaussian noise of variance 0.005 
(SSIM=0.3261), (b.l) restored image with a = 0.12, (3 = (SSIM=0.8979 ), (b.r.) restored 
image with a = 0.06, ft = 0.03 (SSIM=0.9081) 




300 



300 



300 



50 100 150 200 250 300 

(b) Middle row slices of the above pictures. From top to bottom, the slices correspond to (u.l), 
(u.r), (b.l) ans (b.r.) images respectively. A decrease of the staircasing effect is noticeable 

Fig. 5 Isotropic denoising. Image corrupted with Gaussian noise of variance 0.005. Number of iterations=300 



images denoised anisotropicaly are slightly better than 
the isotropically denoised ones. The reason for this 
probably is the fact that in the smooth parts of our 
test image the direction of the gradient is parallel to 
the £-axis. Something that can be treated better with 
anisotropic denoising. For a fairer comparison we also 
test both methods for images whose gradient forms 
angles with the s-axis, see Figure [TT] We observe that 
in the first image (first column) the anisotropic denois- 



ing performs slightly better, with SSIM values 0.8197 
versus 0.8153 in the isotropic case. However in the 
second image (second) column where the image gradi- 
ent forms various angles with the s-axis, the isotropic 
method is more efficient with SSIM value 0.8612 ver- 
sus 0.8445 for the anisotropic. 
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(a) (1.) restored image with a = 0.06, j3 ■ 
a = 0.06, P = 0.06 (SSIM=0.8989) 



0.03 (SSIM=0.9081), (r.) restored image with 




300 



50 100 150 200 250 300 

(b) Middle row slices of the above pictures. From top to bottom, the slices correspond to (1.) and (r.) 
images respectively. For higher values of /3 we observe a decrease of the staircasing effect 

Fig. 6 Isotropic denoising. Restored versions of an image corrupted with Gaussian noise of variance 0.005. Number of 
iterations=300. Comparison between the version with the highest SSIM value and another with the same value of a but 
higher value of /3. The result seems more satisfactory in terms of the absence of staircasing, however higher values of (3 
result in slightly blurrier solutions 



6.3 Isotropic denoising with L 1 fidelity term 



As we mentioned in Section [h2] the L 1 norm in the fi- 
delity term leads to more efficient denoising if the im- 
age is corrupted by impulse noise. The impulse noise 
is characterised by the density that counts the num- 
ber of pixels that are affected by additive noise. For 
example, density 0.1 would mean that approximately 
10% of pixels are affected. The noise is uniformly dis- 
tributed on [0, 1]. In this case we consider the discre- 
tised version of the L 1 minimisation problem which 
reads: 



min \\uq — + a|| Vu||i 4- /3|| V 2 u| 



l- 



(6.9) 



t^ 1 = argrnin \\u\\ 1 + ^\\c k + (u k+1 -u )-u\\ 2 2 , ((,.11) 



w fe+1 = axgmin a\\v\\i + '-\\bl + Vu fc+i - v\\%, (6.12) 



A 



k i Y7„,fc + 1 „.I|2 



u- fe+1 =argmin a\\w\\ 1 +^\\b%+V 2 u k+1 -w\\l, (6.13) 

111 ^ 



c k + (u fc+l _ Uq) _ ~fc4 



(6.14) 



Similarly to the algorithms for minimising with 
an L 2 fidelity term, we solve the minimisation prob- 



lem (6.9) using the following alternating minimisation 



scheme: 



Split Bregman for TV BH L 1 denoising 



u k+1 = argmin -||c fc - u k + u - Uq\\\ 

u £ 



k\\2 



\bf+Vu-v k 
\b k +V 2 u-w k \\ 2 2 , 



(6.10) 



6^' +1 =b k + Vu k+1 - v 



k+l 



b k+1 =b k 2 + VV+i - w fcH 



(6.15) 



(6.16) 



As before the problem ( 6.10 ) is solved through its opti- 



mality condition, while the problems ( 6.11 )-( 6.13 ) are 
solved using shrinkage operators. 

In Figure |12[ denoising results for an image cor- 
rupted with impulse noise of density 0.4 are shown. 
The best results are achieved with a = 1.4, SSIM=0.8717 
for the first order method and with a = 0.9, j3 = 
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Fig. 7 Plot of the SSIM and PSNR values of the restored image as functions of a and /?. For display convenience all the 
values under 0.85 (SSIM) and 26 (PSNR) were coloured with dark blue. The dotted cells corresponds to the highest SSIM 
(0.9081) and PSNR (32.39) value that were achieved for a = 0.06, = 0.03 and a = 0.06, = 0.005 respectively. Note that 
the first column in both plots corresponds to TV-denoising, (/3 = 0). The original image was corrupted with Gaussian noise 
of variance 0.005 




Fig. 8 Left: Middle row slices of reconstructed images with a = 0.12, /3 = (blue color) and a = 0.12. /3 = 0.06 (red color). 
Slices of the original image are plotted with black color. Right: Detail of the first plot. Even though the higher order method 
eliminates the staircasing effect, it also results to further slight loss of contrast 



0.24, SSIM=0.8817 using the second order TV-BH ap- 
proach.. The almost absence of the staircasing effect in 
the recovered images with the first order method, does 
not give much space for improvement for the second 
order method. However, the improvement can still be 



defined the convolution to be circular. The minimisa- 



observed when looking at the slices, see Figure 12(b) 



7 Applications in deblurring 

In the case of deblurring the discretised version of the 
minimisation problem of interest is the following: 



U 



Tu\\ 



allVdl 



•/3||V 2 U || 



(7.1) 



where the operator T denotes the discrete convolution 
with a blurring kernel. In our implementation we used 
discrete approximations of Gaussian kernels of vari- 
ance a = 2, 4. In order to compute the convolution, 
we used MATLAB's built in function imfilter and we 



tion procedure of the functional (7.1 1 was described in 
Section 15.31 



In Figure[l3]we see the deblurring results of an im- 
age blurred after a circular convolution with a Gaus- 
sian kernel of variance a = 2. We show the best results 
of the first order deblurring, a = 0.06, SSIM=0.9066 
and the overall best result, a = 0.06, j3 = 0.004, 
SSIM=0.9173. Note that even a relatively small value 
of (3 can reduce significantly the staircasing effect, 
without blurring further the reconstructed image. 

For higher amounts of blur, similar to high amounts 
of noise, the improvement is not so big anymore. In 



Figure 14 we consider the deblurring results for a larger 
blurring kernel of variance a — 4. The best result of 
the first order method was obtained for a = 0.18, 
SSIM=0.8105 while the overall best result was ob- 
tained for a = 0.18, /3 = 0.002, SSIM=0.8129. We ob- 
serve a decrease of the staircasing effect but not a total 
elimination. As in the denoising application a further 
reduction of the staircasing effect can be achieved by 
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Q-:3B-:3 

(a) (u.l.) clean image, (u.r.) image corrupted with Gaussian noise of variance 0.05 
(SSIM=0.1374), (b.l) restored image with a = 0.3, ,3 = (SSIM=0.8162), (b.r.) restored 
image with a = 0.2, /3 = 0.05 (SSIM=0.8226) 
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(b) Middle row slices of the above pictures. From top to bottom, the slices correspond to (u.l), 
(u.r), (b.l) ans (b.r.) images respectively. A decrease of the staircasing effect is visible 

Fig. 9 Isotropic denoising. Image corrupted with Gaussian noise of variance 0.05. Number of iterations=300 



increasing the value of /?, with the penalty of a slightly 
blurrier result, see Figure [15] 



8 Applications in inpainting 

Finally, we present examples for the application of our 
TV-BH approach to image inpainting. There, the goal 
is to reconstruct an image which has a missing part 
using information from the intact part of the image. 



The missing part is a domain D C S7, known as the 
inpainting domain. In this case the operator T will be 

Tu = Xq\ d u. 

In the absence of noise one usually wants to leave the 
Q\D part of the image unchanged. Hence, small val- 
ues of a and /? are chosen. Consequently the number 
of iterations that are needed for the algorithm to con- 
verge is large because the heavy weighting of the fi- 
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(a) (u.l.) clean image, (u.r.) image corrupted with Gaussian noise of variance 0.005 
(SSIM=0.3261), (b.l) restored image with a = 0.12, /3 = (SSIM=0.8983), (b.r.) re- 
stored image with a = 0.04, = 0.025 (SSIM=0.9164) 




Fig. 10 Anisotropic denoising. Image corrupted with Gaussian noise of variance 0.005. Number of iterations=200 



delity term introduces a damping on the iterates of 
the Split Bregman iteration. 



In Figure 16 the results of inpainting for two geo- 
metrical images, a stripe with a large missing part in 
the middle and its 45° tilted version are shown. Again 
we test the isotropic version versus the anisotropic ver- 
sion of the regularising term. Since the missing gap is 
large, none of the TV versions manages to connect the 
strip over the gap whereas in all cases the second or- 



der method manages to connect it. We notice that the 
anisotropic version performs slightly better in the hor- 
izontal stripe but it is unreliable in the tilted stripe. 
In contrast the isotropic version performs in the same 
manner in both cases as one would expect. 

The results for inpainting a real-world image are 
shown in Image |17| The inpainting for the colour im- 
age has been achieved by applying the TV-BH in- 
painting approach to each colour channel r, g and b 
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Fig. 11 Isotropic versus anisotropic second order denoising. The value of parameters was a. = 0.3, /3 = 0.07, number of 
iterations was 200, for all the images. First row: Original images. Second row: Images corrupted with Gaussian noise of 
variance 0.05. Third row: Isotropic second order denoising (SSIM=0.8153, 0.8612 from left to right). Fourth row: Anisotropic 
second order denoising (SSIM=0.8197, 0.8445 from left to right). It is clear that in parts of the image where the gradient 
is parallel or vertical to the x-axis, the anisotropic denoising performs better. This is not the case for general directions of 
the gradient where the isotropic method produces better results 



separately. As it is seen in the zoomed parts in the 
last row's subfigures, the second order method out- 
performs TV, in terms of the connectivity principle. 
Note also that in the second order inpainting, we used 
a small value of a — 10~ 6 . The result looks similar if 
we choose a = 0. 



Computational times 

All the numerical computations were done on a MAC 
OS X 3.06 GHz, 4 GB RAM using MATLAB. In Ta- 
ble [l] we display the computational times in order to 
achieve a relative residual ||ttfe — u k _i ||/||ufe|| < 10~ 4 
for the examples of Figure [5] We compare the compu- 
tational times of the two splitting techniques that were 
discussed in Section [5731 We notice that relative resid- 
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(a) (u.l.) clean image, (u.r.) image corrupted with impulse noise of density 0.4 
(SSIM=0.0880), (b.l) restored image with a = 1.4, = (SSIM=0.8717), (b.r.) restored 
image with a = 0.9, p = 0.24 (SSIM=0.8817) 
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(b) Middle row slices of the above pictures. From top to bottom, the slices correspond to (u.l), 
(u.r), (b.l) and (b.r.) images respectively 

Fig. 12 Impulse noise denoising with L 1 norm in the fidelity term. Image corrupted with impulse noise of density 0.4. 
Number of iterations=500 



ual of order 10 -4 is achieved in at most 60 iterations. 
The fastest time for the second order method is 3.73 
seconds versus 2.66 seconds for TV-minimisation and 
it is achieved using 4 iterations of a preconditioned 
Conjugate Gradient method (peg function in MAT- 
LAB) in order to solve the linear system in each iter- 
ation. The alternative splitting did not seem to pro- 
vide faster convergence. However we believe that we 
can achieve faster computational times by optimizing 



further our code , e.g., implementation in C++, alter- 
native linear system solver. This is a work in progress. 

9 Conclusion 

We formulate a second order variational problem in 
the space of functions of bounded Hessian in the con- 
text of convex functions of measures. We prove exis- 
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D-::G :J 

QoQ:i 

(a) (u.l.) clean image, (u.r.) image convoluted with Gaussian kernel of variance a = 2 
(SSIM=0.8659), (b.l) restored image with a = 0.06, = (SSIM=0.9066), (b.r.) restored 
image with a = 0.06, ft = 0.004 (SSIM=0.9173) 
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(b) Middle row slices of the above pictures. From top to bottom, the slices correspond to (u.l), 
(u.r), (b.l) and (b.r.) images respectively 

Fig. 13 Deblurring. Variance of the convolution kernel is a = 2. Number of iterations = 100. A relatively small value of , 
can reduce significantly the staircasing effect 



tence and uniqueness of minimisers using a relaxation 
technique. We propose the use of the split Bregman 
method for the numerical solution of the analogue dis- 
cretised problem. The application of split Bregman 
method in our model is quite robust and is converging 
after a few iterations. We perform numerical experi- 
ments in denoising, reconstructing images that have 
been corrupted with Gaussian and impulse noise, in 



deblurring deconvoluting images that have been con- 
voluted with Gaussian kernels, as well as in image in- 
painting. 

In every case, the introduction of the second or- 
der term leads to a significant reduction of the stair- 
casing effect resulting more in piecewise smooth im- 
ages rather than piecewise constant as in images re- 
constructed using the ROF model. The superiority of 
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(a) (u.l.) clean image, (u.r.) image convoluted with Gaussian kernel of variance <j = 4 
(SSIM=0.7392), (b.l) restored image with a = 0.18, /3 = (SSIM=0.8105), (b.r.) restored 
image with a = 0.18, ft = 0.002 (SSIM=0.8129) 
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(b) Middle row slices of the above pictures. From top to bottom, the slices correspond to (u.l), 
(u.r), (b.l) and (b.r.) images respectively 

Fig. 14 Deblurring. Variance of the convolution kernel is a = 4. Number of iterations = 100. A relatively small value of , 
reduces the staircasing effect 



using a combination of the first and the second order 
rather than the first order only, is confirmed quanti- 
tatively by the SSIM index. In the case of inpainting 
the higher order able is able to connect edges along 
large gaps something that TV-inpainting is incapable 
of. 

As fas as future work is concerned, there is some 
room for improvement concerning the optimality of 



the programming code, as our implementation is rel- 
atively slow, especially in deblurring. Moreover, the 
relation between the continuum and the discretised 
model could be investigated through _T-convergence 
arguments, see [TH] and [7]. Moreover, the characteri- 
sation of subgradients of this approach and the anal- 
ysis of solutions of the corresponding PDE flows for 
different choices of functions f and g promises to give 
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(a) (1.) restored image with best SSIM a = 0.18, = 0.002 (SSIM=0.8129), (r.) restored 
image with higher /3, a = 0.18, = 0.004 (SSIM=0.7894) 




Fig. 15 Further reduction of the staircasing effect by increasing the value of j3. Variance of the Gaussian convolution kernel 
is a = 4 



Method 


Splitting 


Lin. System Solver 


Seconds 


No. of Iterations 


Rel. Error 


TV 


standard 


LU 


6.49 


54 


0.28 % 


TV 


standard 


PCG 


2.66 


51 


0.26 % 


TV-BH 


standard 


LU 


7.53 


47 


0.20 % 


TV-BH 


standard 


PCG 


3.73 


37 


0.33 % 


TV-BH 


alternative 


LU 


20.07 


61 


0.31% 


TV-BH 


alternative 


PCG 


10.77 


58 


0.33 % 



Table 1 Computational times for the two examples of Figure p| to reach a relative residual — Wfc_x||/||itfc|| < 10~ 4 . 
The linear system solver is either LU-Factorisation or a Preconditioned Conjugate Gradient method (MATLAB's built in 
function) with stopping criteria either maximum 4 iterations or residual less than 10~ 4 . The last column of the table shows 
the relative error between the results and the "correct" solution, where the correct solutions are chosen the ones that are 
shown in Figure [5] (300 iterations) 



more insight into the qualitative properties of this reg- 
ularisation procedure. Higher order models in image 
reconstruction form currently an active field of re- 
search. During the last years many methods have been 
introduced, the most important of which are men- 
tioned in the Section [L"2| It will be interesting to com- 
pare these methods with ours in particular with the 
new notion of total generalised variation TGV which 
has produced some very good results, see [5]. 
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A Some results about convex functions of 
measures 

Proposition A.l. Suppose that g : R m — > R is continu- 
ous function, positively homogeneous of degree 1 and let /i £ 
[A / l(l?)] m . Then for every positive measure Radon measure v 
such that fjj is absolutely continuous with respect to v, we have 

gM :=9 (r) w=5 O v ' 

Moreover, if g is a convex function, then g : [A4(f2)] m — > 
Ai(O) is a convex function as well. 

Proof. Since pi <C v, we have that \n\ <C v. Using the fact 
that g is positively homogenous we get 

*>"(r>'=»(r)v— (0)— O- 
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(a) (1.) Original image, (r.) Inpainting domain (b) (1.) Original image, 45° tilted version of (a), 
(grey) (r.) Inpainting domain (grey) 




(c) (1.) Anisotropic TV-inpainting, a = 0.12, p = 0, (d) (1.) Anisotropic TV-inpainting, a = 0.12, p = 
4000 iterations, (r.) Anisotropic second order in- 0, 4000 iterations, (r.) Anisotropic second order in- 
painting, a = 10 -4 , P = 0.01, 1500 iterations painting, a = 10~ 4 , P = 0.01, 1500 iterations 




(e) (1.) Isotropic TV-inpainting, a = 0.12, p = 0, (f) (1.) Isotropic TV-inpainting, a = 0.12, ,9 = 0, 
4000 iterations, (r.) Isotropic second order inpaint- 4000 iterations, (r.) Isotropic second order inpaint- 
ing, a = 10" 4 , P = 0.01, 1500 iterations ing, a = 10~ 4 , P = 0.01, 1500 iterations 



Fig. 16 Inpainting geometrical images 



Assuming that g is convex and using the first part of the 
proposition we get for < A < 1, fj,, v 6 [M{Q)] m : 

9 (A, + (l-A),)= 9 (^±|i^)|A, + ( l-A H 

= 9 ( x rrzn + V- - ^rrzn) + H) 
^ A9 (RTH) (H + H) + (1 - A)3 (RfR) (H + H) 

= \g(n) + (l-\)g(u). 

□ 

The following theorem which is a special case of a the- 
orem that was proved in 1 1 1 1 and can be also found in [5] 
establishes the lower semicontinuity of convex functionals of 
measures with respect to the weak* convergence. 

Theorem A. 2 (Buttazzo-Freddi, 1991). Let Q be an open 
subset o/R n , v, (u k ) k ^ be R m -valued finite Radon measures 



and fi, (ft)fegn be positive Radon measures in Q. Let g : R m — > 
R be a convex function and suppose that u k — > v and fi k — > 
H weakly* in Q. Consider the Lebesgue decompositions v = 
{iy/fi)ti + v s , v k = (v k /fi k )[i k + f|, k e N. Then 

L g {i~ {x) ) Mx) + L goc (i5i (:r) ) dK|(:r) - 

liminf f g f — (x)) d» k [x) + f g x (-^r(x)) d\v° k \(x). 
Jn W / J a \\K\ J 

In particular, if /i = fi k = C n for all k a N then according 
to the definition \2.1\ the above inequality can be written as 
follows: 

g{v){Q) < \iramig{v k ){Q). 

re— ^oo 

The following theorem is a special case of Theorem 2.3 
in [22] . 

Theorem A. 3 (Demengel-Temam, 1984). Suppose that Q C 
R 71 is open, with Lipschitz boundary and let g be a convex 
function from R" x ™ to R with at most linear growth at in- 
finity. Then for every u £ BH(O) there exists a sequence 
("fc)fc£N C C*°°(I2) n W 2 ' 1 {Q) such that 

IK - -> 0, \D 2 u k \(Q) -> |D 2 «|(^), 

g(D 2 u k )(!1) -> g(D 2 u)(f2), as k^too. 
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Fig. 17 Inpainting a real world image: Sotiris the turtle, (u.l) original image, (u.r.) destroyed image and initialisation of the 
algorithm, (m.l) TV-inpainting, a = 0.001, ft = 0, 7000 iterations, (m.r.) second order inpainting, a = 10 -6 , ft = 0.0005, 
5000 iterations. In the last rows we have enlarged some details of the two reconstructed images 



B Weak continuity and differentiability 
notions in BV 

The purpose of this appendix is to remind the reader of the 
notions of approximate limits, continuity, jump points and 
differentiability of BV functions. We remind also the decom- 
position of the distributional derivative into an absolutely 
continuous part, a jump part and a Cantor part. We assume 
that Q C 1™ is open with Lipschitz boundary. 

Definition B.l (Approximate limit-approximate continu- 
ity). Let u £ [L 1 (i7)] m . We say that u has an approximate 
limit at x £ (2 if there exists a z £ R m such that 

lin ?, ml m / Mf) ~ z \ dy = °- 
P^O \Bp{x)\ J Bp (x) 

We denote with S u the set of points where this property does 
not hold. The vector z, which is unique if it exists, is denoted by 
u[x). Ifu(x) = u(x) we say that u is approximately continuous 
in x . 

Definition B.2 (Approximate jump point). We say that x 
is an approximate jump point if there exist a, b S K m and 
v e S' 1-1 = {x e R n : \x\ = 1} such that a ^ b and 



lim 



i"- 1 = {x e 
i 



p^o \B+{x,u)\ Jb+( x ,u) 



\u(y) -a\ dy = 0, 



p->o I B p ( X: u)\ J B -{x,u) 
with 

B+(x) = {yeB p (x): (y-x,u)>0}, 
Bp{x) = {y(LB p {x): {y-x,v)<0}, 

where (-, ■) denotes the Euclidean inner product. The set of all 
approximate jump points is denoted by J u . The triplet (a,b,v) 
will be denoted by (w+(x), u~ (cc), u u (x)). 

Definition B.3 (Approximate differentiability). Let 

u S [L (l?)] m . We say that u is approximately differentiable at 
a point x £ Q \ S u if there is a linear function L : R™ — > R m 
such that 



lim 



p^o \B p (x)\ J Bp(x) 



\u(y) - u(x) - L(y - x)\ 



dy = 0. 



The linear functional L is represented by an rax n matrix which 
is called the approximate differential of f at x and will be de- 
noted by V«(x) . 

Note that the approximate differential is a linear func- 
tion of u. It is a fact that if u 6 [BV{Q)] m then ri n ~ 1 {S u \ 
Ju) = 0. Moreover, Calderon-Zygmund theorem says that 
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u is approximately differentiable at £ n -almost every point 
of Q and the approximate differential Vu is the density of 
the absolutely continuous part of Du with respect to C n 
i.e. Du = Vu£° + D s u. Consequently we have the following 
decomposition for the derivative of u: 

Du = D a u + D a u = VuC n + D J u + D c u, 

where D^u := D B u f J u , the restriction of D a u on J u , and 
D c u := D s u \ (J? \ S u ) are the jump and the Cantor part of 
the derivative respectively. Moreover it can be shown that 
the jump part is equal to 

D ] (B) = I (u+ (x)-u~ {x))®u a (x) dH n - 1 {x), VB e B{Q). 
Jflnj„ 

We recall that if a e R m and b e K" then a ® 6 denotes the 
m x n matrix with (ij)-th entry a;fej. This implies that 

\Du\(n) = / \Vu(x)\dx + \u+(x) - u~(x)\ dn n ' 1 (x) 

+ \D c u\(Q\S u ). (B.l) 



divHi 2 (wi2)(«, j) = ' 



wi2(i — l,j)—wi2{i,j) if 1 < i, 

-u>i 2 (i- l,j + 1) i<m 
+^12 («> j + 1) 

Wi2(«,i + 1) — Wi2(i,i) if£=l, 



w 12 (i - l,j) 
-w 12 (i - l,j + 1) 
wi 2 {i,j + 1) 
-w 12 (i - l,j + 1) 
w\ 2 (i — l,j) — w\ 2 (i,j) if 1 < i < n, 
3 = m, 

u>i2(i,3 + l) if 



1 < j < m, 
if i = n, 

1 < j < m, 
if 1 < i < n, 
3 = 1, 



-u>i 2 (i,j) 
-u>i 2 (i -l,j + 1) 
w 12 (i - l,j) 



if 



3 = m, 



C Formulation of the discrete second 
divergence operators 

In this appendix we define the first and second discrete di- 
vergence operators div and divH, that have the properties: 



div : (K nxm ) z -» 

Vu £ R nxm , v e ( 
divH : (K nxm ) 4 - 
w £ 



with 



- div(u) ■ u = v ■ Vu, 



with divH(io) ■ u = w ■ Hu, 



The formulation we mention here is due to ]E\. For v = 
(«i,«a) £ (M nxm ) 2 we have that div(t>) = divi(t)i)+div 2 (t) 2 ) 
where divi(ui), div2(u2) belong to R nxm and 



—vi(i,j— 1) ifl<j'<m, 



divi(^i)(i,i) = 



vi(i,j) 
-vi(i,j - 1) 



if 3 = 1, 
if j = m, 



divH 2 i(M>2i)(«,j) = < 



' ui 21 (i,j - 1) - w 21 (i,j) if l<i, 
-w 21 (i + 1, j - 1) j<m, 
+w 21 (i + 

u> 2 i(i -\- 1, j) if i = 1, 

— u> 2 i(i + 1, j — 1) 1 < j < m, 

u>2l(i,j- 1) — W2i(i>i) if« = n, 

1 < j < m, 

">2i(* + 1, j) — «>2i(«, j) if 1 < i < n, 
3 = 1 
if 1 < i < n, 



w 2 i{i,j - 1) 
— u>2i(£ + 1,3 - 1) 
+ l,j) 

—W2l(i + 1,3 - 1) 

—W2l(i,j) 

mi(i,3 - 1) 



3 


= m, 


if i 


= 1, 


3 


= 1, 


if i 


= 1, 


J 


= m, 


if i 


— n , 




= 1 


if j 


= n, 


3 


= m. 



! v 2(i,j) — v 2 (i — if 1 < i < n, 

v 2 (i,j) if i = 1, 

- u 2 (i — l,i) if i = n. 

For the discrete second divergence divH we have that for 
w = {wn,uii 2 ,w 2 i,w 22 ) £ (R nxm ) 4 we have that divH(wj) = 
divHii(uin) + divHi2(uii2) + divH2i(u)2i) + divH22("!«22) 
where divHn(ifln), divHi 2 («'i2), divH 2 i(i«2i), divH 2 2(«'22) 
belong to R nxm and 



divHn(wn)(i,i) 



'wn(i,j - 1) - 2wn(i,j) if 1 < j < m, 
+wn(i,j + 1) 

+ 1) — u>li{i,j) ifj = l, 

^w\x(i,3 — 1) — uin(i, i 7') if i = m, 



D Some useful theorems 

Lemma D.l (Kronecker's lemma). Suppose that (a„)„ 6 m 
and (fcn)ngN are two sequences of real numbers such that "}2^ =1 a 
oo and < 6i < b 2 < . . . tuit/i b n — > oo. Then 



1 " 

r- I] ^"fc -> 0, 



as n — > oo. 



/n particular, if (c n ) ng pj is a decreasing positive real sequence 
such that < oo, 4ft,en 

n 

c n y ' Cfc — > 0, as n — > oo. 



divH22(«>22)(*, j) = 



«J22(i — l,j) — 2u! 22 {i,j) 

+w 22 {i + 

w 22 (i + - w 22 (i,j) 
w 22 (i — 1, j) — w 22 (i,j) 



if 1 < i < n, 

if j = 1, 
if j = n, 
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